



# microunity

# Zeus System Architecture

COPYRIGHT 1998 MICROUNITY SYSTEMS ENGINEERING, INC. ALL RIGHTS RESERVED.



Craig Hansen, Chief Architect

MicroUnity Systems Engineering, Inc., 475 Potrero Avenue, Sunnyvale, CA 94086.4118. Phone: 408.734.8100, Fax: 408.734.8136

email: craig@microunity.com http://www.microunity.com

# Contents

| Tables and Figures                    |
|---------------------------------------|
| Introduction                          |
| Conformance                           |
| Mandatory and Optional Areas          |
| Upward-compatible Modifications       |
| Promotion of Optional Features        |
| Unrestricted Physical Implementation  |
| Draft Version                         |
| Common Elements                       |
| Notation                              |
| Bit ordering                          |
|                                       |
| Memory                                |
| Byte                                  |
| Byte ordering                         |
| Memory read/load semantics            |
| Memory write/store semantics          |
| Data                                  |
| Fixed-point Data                      |
| Address                               |
| Floating-point Data                   |
| Zeus Processor                        |
| Architectural Framework               |
| Interfaces and Block Diagram          |
| Instruction                           |
| Assembler Syntax                      |
| Instruction Structure                 |
|                                       |
| Gateway                               |
| User State                            |
| General Registers                     |
| Program Counter                       |
| Privilege Level                       |
| Program Counter and Privilege Level   |
| System state                          |
| Fixed-point                           |
| Load and Store                        |
| Branch                                |
| Addressing Operations                 |
| Execution Operations                  |
| Floating-point                        |
| Branch Conditionally                  |
| Compare set                           |
| Arithmetic Operations                 |
| Rounding and exceptions               |
| NaN handling                          |
| Floating-point functions              |
| Digital Signal Processing             |
| Data handling Operations              |
| Arithmetic Operations                 |
| Color to 14 (American)                |
| Galois Field Operations               |
| Software Conventions                  |
| Register Usage                        |
| Procedure Calling Conventions         |

| Instruction Scheduling                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
          | <b> 4</b> 7.                                                               |
|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
----------|----------------------------------------------------------------------------|
| Separate Addressing from Execution                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
          | <b>47</b>                                                                  |
| Software Pipeline                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
          | 47                                                                         |
| Multiple Issue                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
          | 47                                                                         |
| Functional Unit parallelism                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
          | 47                                                                         |
| Latency                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
          | 48                                                                         |
| Phpeline Organization                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
          | 49                                                                         |
| Classical Procline Structures                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
          | 49                                                                         |
| Superstring Pipeline                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
          | 50                                                                         |
| Supremona Barbar                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
          | 61                                                                         |
| Superthread Pipeline                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
          | 52                                                                         |
| Superthread Pipeline                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
          | 53                                                                         |
| Branch/fetch Prediction                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
          | 4                                                                          |
| Additional Load and Execute Resour                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
          | CH., 55                                                                    |
| Result Forwarding                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
          | 55                                                                         |
| Instruction Set                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
          | 57                                                                         |
| Major Operation Codes                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
          |                                                                            |
| Minor Operation Codes                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
          | 59                                                                         |
| General Forms                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
          | 43                                                                         |
| Instruction Fetch                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
          | 64                                                                         |
| Perform Exception                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
          | 65                                                                         |
| Instruction Decode                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
          | 45                                                                         |
| Always Reserved
          |                                                                            |
| Address                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
          | 73                                                                         |
| Address Compare                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
          | 76                                                                         |
| Address Compare                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
          | 79                                                                         |
| Address Immediate Address Immediate Reversed                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
          |                                                                            |
|                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
          |                                                                            |
| Address Immediate Reversed | 83 |
| Address Reversed | |
| Address Shift Left Immediate Add | 86 |
| Address Shift Left Immediate Subtract | |
| Address Shift Immediate | |
| Address Ternary | |
| Branch | |
| Branch Back | |
| Branch Barrier | |
| Branch Conditional | |
| Branch Conditional Floating-Point | |
| Branch Conditional Visibility Floating-Point | |
| Branch Down | |
| Branch Gateway | |
| Branch Halt | |
| Branch Hint Immediate | |
| Branch Immediate | |
| Branch Link | |
| Load | |
| Load Immediate | |

| Group Add                              | 135       | Level One Cache                      | 332 |
|----------------------------------------|-----------|--------------------------------------|-----|
| Group Add Halve                        | 138       | Level One Cache Stress Control       | 342 |
| Group Boolean                          | 141       | Level One Cache Redundancy           | 342 |
| Group Compare                          | 148       | Memory Attributes                    | 343 |
| Group Compare Floating-point           | 154       | Cache Control                        |     |
| Group Copy Immediate                   | 157       | Cache Coherence                      | 347 |
| Group Immediate                        |           | Strong Ordering                      | 348 |
| Group Immediate Reversed               |           | Victim Selection                     |     |
| Group Inplace                          | 168       | Detail Access                        | 352 |
| Group Reversed                         |           | Micro Translation Buffer             |     |
| Group Reversed Floating-point          |           | Block Translation Buffer             |     |
| Group Shift Left Immediate Add         |           | Program Translation Buffer           |     |
| Group Shift Left Immediate Subtract    | 181       | Global Virtual Cache                 |     |
| Group Subtract Halve                   |           | Memory Interface                     |     |
| Group Ternary                          |           | Microarchitecture                    |     |
| Crossbar                               |           | Snoop                                |     |
| Crossbar Extract                       | 192       | Load                                 |     |
| Crossbar Field                         | 196       | Store                                |     |
| Crossbar Field Inplace                 |           | Memory                               |     |
| Crossbar Inplace                       |           | Bus interface                        |     |
| Crossbar Short Immediate               |           | Motherboard Chipsets                 |     |
| Crossbar Short Immediate Inplace       |           | Pinout                               |     |
| Crossbar Shuffle                       |           | Pin summary                          |     |
| Crossbar Swizzle                       |           | Electrical Specifications            |     |
| Crossbar Ternary                       | 220       | Bus Control Register                 |     |
| Ensemble                               |           | Emulator signals                     |     |
| Ensemble Convolve Extract Immediate    | 225       | A20M#                                |     |
| Ensemble Convolve Floating point       |           | INIT                                 | 375 |
| Ensemble Extract                       | 236       | INTR                                 | 375 |
| Ensemble Extract Immediate             | 244       | NMI                                  | 375 |
| Ensemble Extract Immediate Inplace     | 251       | SMI#                                 | 376 |
| Ensemble Floating point                |           | STPCLK#                              | 376 |
| Ensemble Inplace                       | 261       | IGNNE#                               | 376 |
| Ensemble Inplace Floating-point        | 264       | Emulator output signals              |     |
| Ensemble Reversed Floating-point       | 26?       | Bus snooping                         | 377 |
| Ensemble Ternary                       | 269       | Locked cycles                        |     |
| Ensemble Ternary Floating point        |           | Locked synchronization instruction   | 377 |
| Ensemble Unary                         | 274       | Locked sequences of bus transactions | 378 |
| Ensemble Unary Floating-point          |           | Sampled at Reset                     | 378 |
| Wide Multiply Matrix                   |           | Sampled per Clock                    |     |
| Wide Multiply Matrix Extract           |           | Bus Access                           | 379 |
| Wide Multiply Matrix Extract Immediate |           | Other bus cycles                     |     |
| Wide Multiply Matrix Floating point    |           | Special cycles                       | 381 |
| Wide Multiply Matrix Galois            |           | I/O cycles                           |     |
| Wide Switch                            |           | Events and Threads                   |     |
| Wide Translate                         |           | Ephemeral Program State              |     |
| Memory Management                      |           | Event Register                       |     |
| Overview                               |           | Event Mask                           |     |
| Local Translation Buffer               |           | Exceptions:                          | 392 |
| Global Translation Buffer              |           | Global TB Miss Handler               |     |
| GTB Registers                          |           | Exceptions in detail                 |     |
| Address Generation                     |           | Reserved Instruction                 |     |
| Memory Banks                           |           | Access Disallowed by virtual address |     |
| Program Microcache                     |           | Access disallowed by tag             |     |
| Wide Microcache                        |           | Access detail required by tag        |     |
| Level Zero Cache                       |           | Access disallowed by global TB       |     |
| Structure                              | 331       | Access detail required by global TB  | 396 |

Zeus System Architecture Tue, Aug 17, 1999

| Global TB miss                         |     |
|----------------------------------------|-----|
| Access disallowed by local TB          | 397 |
| Access detail required by local TB     | 397 |
| Local TB miss                          | 398 |
| Floating-point arithmetic              | 398 |
| Power-on Reset                         | 399 |
| Bus Reset                              |     |
| Control Register Reset                 | 400 |
| Meltdown Detected Reset                | 400 |
| Double Check Reset                     |     |
| Machine Check                          | 400 |
| Parity or Uncorrectable Error in Cache | 401 |
| Parity or Communications Error in Bus  | 401 |
| Event Thread Exception                 |     |

| Reset state                       |      |
|-----------------------------------|------|
| Start Address                     | 412  |
| Internal ROM Code                 | 403  |
| Memory and Devices                | 404  |
| Physical Memory Map               | 414  |
| Architecture Description Register | 407  |
| Status Register                   |      |
| Control Register                  | 410  |
| Clock                             | 417  |
| Clock Cycle                       | 412  |
| Clock Event                       |      |
| Clock Watchdog                    |      |
| Tally                             |      |
| Tally Counter                     |      |
| Tally Control                     | 415  |
| Thread Register                   | 417  |
| index                             |      |
| IGC4                              |      |


# Tables and Figures

| frecipine notation                             | 9   |
|------------------------------------------------|-----|
| compare-branch relations                       | 23  |
| compare-set relations                          | 23  |
| 32-bit 2-way deal                              | 36  |
| 16-bit 4-way deal                              | 17  |
| 16-bit 2-way shuffle                           | 17  |
| 16-bit 4-way shuffle                           |     |
| 16-bit reverse                                 |     |
| Compress 32 bits to 16, with 4-bit right shift | 36  |
| Expand 16 bits to 32, with 4-bit left shift    | 36  |
| selected needs                                 |     |
| register usage | 40 |
| Alignment within dp region | 42 |
| Gateway with pointers to code and data spaces | 45 |
| canonical pipeline | 49 |
| canonical pipeline | 49 |
| superscalar pipeline | 50 |
| superpipelined pipeline | 50 |
| Superstring pipeline | 51 |
| Superior                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
                                                 | . 52                                                    |
| Superspring pipeline | 52 |
| Superthread pipeline | 61 |
| Superthread pipeline | 53 |
| Superthread pipeline | |
| minor operation code field values for A.MINOR | 18 |
|                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
                                                 | er.                                                     |
| minor operation code field values for B.MINOR | 1960 |
|                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
                                                 |                                                         |
| many operation and field when feet his W                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
                                                 | 1060                                                    |
| minor operation code field values for L.MINOR | 59 |
| minor operation code field values for S.MINOR | 60 |
| minor operation code field values for G.size | 60 |
| minor operation code field values for XSHIFT | 60 |
| minor operation code field values for E.size | 60 |
| minor operation code field values for W.MINOR.L or W.MINOR.B | |
| minor operation code field values for the EMULX, EMULADDX, ECONX, EEXTRACT, and WMULMATX instruction variants [individual mnemonics illegible in source] | |
                                                 | )R59<br>)R60<br>(60<br>(760<br>(60<br>(61<br>(11)       |
| minor operation code field values for L.M.i. 'Commor operation code field values for S.MINO minor operation code field values for G. minor operation code field values for G. minor operation code field values for XSI IIFT minor operation code field values for Essie minor operation code field values for Essie minor operation code field values for Essie minor operation code field values for EMULAINOR.B                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                                                 | )R59<br>)R60<br>(60<br>(760<br>(60<br>(61<br>(11)       |
| minor operation code field values for L.M.i. 'Commor operation code field values for S.MINO minor operation code field values for G. minor operation code field values for G. minor operation code field values for XSI IIFT minor operation code field values for Essie minor operation code field values for Essie minor operation code field values for Essie minor operation code field values for Emul. M.M. EMULXIU, EMULXIU, EMULXIU, EMULXIU, EMULXIU, EMULXIU, EMULADDXIU, EMULADDXIU, EMULADDXIU, ECONXIUH, EMULMATXIUH, WMULMATXIUH, WMULMATXIUH, WMULMATXIUH, WMULMATXIUH, COPETANG G.COPYL, GANG              | )R59<br>)R60<br>(60<br>(760<br>(60<br>(61<br>(11)       |
| minor operation code field values for L.M.i. 'Commor operation code field values for S.MINO minor operation code field values for G. minor operation code field values for C.SI. [1] minor operation code field values for XSI. [1] minor operation code field values for Essae minor operation code field values for Essae minor operation code field values for Essae minor operation code field values for EMUL. [2] EMULADIXIU, EMULADIXIU, EMULADIXIU, EMULADIXIU, EMULADIXIU, ECONXIUI, ECONXI             | DR59<br>PR60<br>.60<br>T60<br>.60<br>.61<br>L1,         |
| minor operation code field values for L.M.i. 'Commor operation code field values for S.MINO minor operation code field values for G. minor operation code field values for C.SI. [1] minor operation code field values for XSI. [1] minor operation code field values for Essae minor operation code field values for Essae minor operation code field values for Essae minor operation code field values for EMUL. [2] EMULADIXIU, EMULADIXIU, EMULADIXIU, EMULADIXIU, EMULADIXIU, ECONXIUI, ECONXI             | DR59<br>PR60<br>.60<br>T60<br>.60<br>.61<br>L1,         |
| minor operation code field values for L.M.I. 'Commor operation code field values for S.MINO minor operation code field values for G. minor operation code field values for G. minor operation code field values for XSLIFT minor operation code field values for Essa minor operation code field values for Essa minor operation code field values for Essa minor operation code field values for EMULX EMULAUDXIU, EMULAUDXIU, EMULAUDXIU, EMULAUDXIU, EMULAUDXIU, EMULAUDXIU, ECONXIUI, EC | DR59<br>PR60<br>.60<br>T60<br>.60<br>.61<br>L1,         |
| minor operation code field values for L.M.I. 'Commor operation code field values for S.MINO minor operation code field values for G. minor operation code field values for G. minor operation code field values for XSI IIFT minor operation code field values for Ense minor operation code field values for Ense minor operation code field values for Ense minor operation code field values for EMULA EMULATION, EMULADIONIU, EMULADIONIU, EMULADIONIU, EMULADIONIU, EMULADIONIU, ECONNIUH, ECON             | DR59<br>PR60<br>.60<br>T60<br>.60<br>.61<br>L1,         |
| minor operation code field values for LAG. 'C minor operation code field values for S.MINO minor operation code field values for G. MINO minor operation code field values for G. MINO minor operation code field values for XSTHFT minor operation code field values for E-min E-min minor operation code field values for E-min E-min Minoral of W.MINORB                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | OR59<br>OR60<br>.60<br>T60<br>.60<br>.61<br>.61<br>.11, |
| minor operation code field values for L.M.I. 'Commor operation code field values for S.MINO minor operation code field values for G. minor operation code field values for G. minor operation code field values for XSI IIFT minor operation code field values for Ense minor operation code field values for Ense minor operation code field values for Ense minor operation code field values for EMULA EMULATION, EMULADIONIU, EMULADIONIU, EMULADIONIU, EMULADIONIU, EMULADIONIU, ECONNIUH, ECON             | OR59<br>OR60<br>.60<br>T60<br>.60<br>.61<br>.61<br>.11, |

| A.COM.op and G.COM.op.size | |
|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------|
| Brook seems | |
| District Section | |
| Crossbar extract | |
| Crossbar merge extract | |
| 4-way shuffle bytes within hexlet | |
| 4-way shuffle bytes within triclet | |
| Ensemble convolve extract immediate doublets | |
| Ensemble convolve extract immediate complex doublets | 236 |
| Ensemble convolve floating point half little endian | |
| Ensemble convolve complex floating point half little endian | |
| Ensemble complex multiply extract doublets | 239 |
| Ensemble scale add extract doublets | |
| Ensemble complex scale add extract doublets | |
| Ensemble extract | |
| Ensemble merge extract | |
| Ensemble multiply extract immediate doublets | |
| Ensemble multiply extract immediate complex doublets | |
| Ensemble multiply add extract immediate doublets | |
| Ensemble multiply add extract immediate complex doublets | |
| Ensemble multiply add extract immediate double<br>Ensemble multiply add extract immediate compli<br>doublets                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              | 16254<br>16<br>16<br>16      |
| Ensemble multiply add extract immediate double<br>Ensemble multiply add extract immediate comple<br>doublets                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              | 16254<br>18<br>18<br>18<br>1 |
| Ensemble multiply add extract immediate double<br>Ensemble multiply add extract immediate comple<br>doublets 255<br>Ensemble multiply add extract immediate comple<br>doublets 256<br>Ensemble multiply Galvis field bytes 276<br>Wide multiply matrix 286                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                | ts254<br>rs<br>rs            |
| Ensemble multiply add extract immediate double Ensemble multiply add extract immediate comple doublets 255 Ensemble multiply add extract immediate comple doublets 256 Ensemble multiply Gal-us field bytes 276 Wide multiply matrix 256 Wide multiply matrix 256 Wide multiply matrix 256                                                                                                                                                                                                                                                                                                                                                                                                                                                                | 14254<br>14<br>14<br>14      |
| Ensemble multiply add extract immediate double Ensemble multiply add extract immediate comple doublets 255 Ensemble multiply add extract immediate comple doublets 256 Ensemble multiply Galva field bytes 257 Wide multiply matrix 258 Wide multiply matrix 258 Wide multiply matrix 258 Wide multiply extract matrix doublets 259 Wide multiply extract matrix doublets 259                                                                                                                                                                                                                                                                                                                                                                             | 14254<br>14<br>14<br>14<br>1 |
| Ensemble multiply add extract immediate double Ensemble multiply add extract immediate comple doublets                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | 14254<br>14<br>14<br>14<br>1 |
| Ensemble multiply add extract immediate double Ensemble multiply add extract immediate comple doublets                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | ts254                        |
| Ensemble multiply add extract immediate double Ensemble multiply add extract immediate complet doublets                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   | ts254                        |
| Ensemble multiply add extract immediate double Ensemble multiply add extract immediate complications and extract immediate complications are supported to the ensemble multiply add extract immediate complications are supported to the ensemble multiply add extract immediate complex and wide multiply matrix complex doublets.  Wide multiply extract matrix doublets are with a multiply extract matrix complex doublets. Water multiply matrix extract immediate complex are multiply matrix extract immediate complex.                                                                                                                                                                                                                            | ts254 rs i                   |
| Ensemble multiply add extract immediate double Ensemble multiply add extract immediate complet doublets                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   | ts254                        |
| Ensemble multiply add extract immediate completions between the multiply add extract immediate completions and extract immediate completions. See the multiply add extract immediate completions are seen to the multiply add extract immediate completions. See the multiply of the multiply of the multiply of the multiply of the multiply extract metric doublets. See the multiply extract metric timmediate doublets will be multiply matrix extract immediate doublets. See the multiply matrix extract immediate complex doublets. See the multiply matrix extract immediate complex doublets. See the multiply matrix extract immediate complex doublets.                                                                                        | ts254                        |
| Ensemble multiply add extract immediate completions by the multiply of the multiply of the multiply of the multiply of the multiply extract metric doublets. While multiply extract metric temples doublets while multiply matrix extract immediate doublets while multiply matrix extract immediate complex doublets. While multiply matrix extract immediate complex doublets. While multiply matrix floation; point half. While multiply matrix floation; point half.                                                                                                                            | 11254                        |
| Ensemble multiply add extract immediate completions between the multiply add extract immediate completions and immediate completions. 250 Ensemble multiply add extract immediate completions between the multiply add extract immediate completions. 250 Wide multiply matrix complex. 260 Wide multiply matrix complex. 260 Wide multiply extract matrix complex doublets. 250 Wide multiply extract matrix complex doublets. 250 Wide multiply matrix extract immediate complex doublets. 250 Wide multiply matrix extract immediate complex doublets. 250 Wide multiply matrix floatics; point half. 250 Wide multiply matrix floatics; point half. 251 Wide multiply matrix floatics; point half. 251 Wide multiply matrix floatics; point half. 251 | 11254                        |
| Ensemble multiply add extract immediate completions between the multiply add extract immediate completions and wide multiply matrix complex and wide multiply matrix complex and multiply extract matrix complex doublets. Wide multiply extract matrix complex doublets are multiply matrix extract immediate complex doublets. Wide multiply matrix extract immediate complex doublets with multiply matrix floation; point half. Wide multiply matrix floation; point half.                                                                                                       | 11254                        |
| Ensemble multiply add extract immediate completions between the multiply add extract immediate completions and the multiply matrix. The multiply matrix completions are with a multiply extract matrix completions and multiply extract matrix completions are multiply matrix extract immediate doublets. Will multiply matrix extract immediate completions are multiply matrix floations from the multiply matrix floations floating point half. Will multiply matrix completions floating point half.  Will multiply matrix Galois.                                              | m254                         |
| Ensemble multiply add extract immediate completions between the multiply add extract immediate completions and the multiply matrix complex and wide multiply matrix complex and multiply extract matrix complex doublets. Wide multiply extract matrix complex doublets. Wide multiply matrix extract immediate complex doublets has wide multiply matrix extract immediate complex doublets has wide multiply matrix floating point half with wide multiply matrix complex floating point half with wide multiply matrix. Galois her multiply matrix are regarization.              | m254                         |
| Ensemble multiply add extract immediate completions between the multiply add extract immediate completions and the multiply matrix. The multiply matrix completions are with a multiply extract matrix completions and multiply extract matrix completions are multiply matrix extract immediate doublets. Will multiply matrix extract immediate completions are multiply matrix floations from the multiply matrix floations floating point half. Will multiply matrix completions floating point half.  Will multiply matrix Galois.                                              | m254                         |

Tue, Aug 17, 1999

# Introduction

MicroUnity's Zeus Architecture describes general-purpose processor, memory, and interface subsystems, organized to operate at the enormously high bandwidth rates required for broadband applications.

The Zeus processor performs integer, floating-point, signal processing and non-linear operations such as Galois field, table lookup and bit switching on data sizes from 1 bit to 128 bits. Group or SIMD (single instruction multiple data) operations sustain external operand bandwidth rates up to 512 bits (i.e., up to four 128-bit operand groups) per instruction even on data items of small size. The processor performs ensemble operations such as convolution that maintain full intermediate precision with aggregate internal operand bandwidth rates up to 20,000 bits per instruction. The processor performs wide operations such as crossbar switch, matrix multiply and table lookup that use caches embedded in the execution units themselves to extend operands to as much as 32768 bits. All instructions produce at most a single 128-bit register result, source at most three 128-bit registers and are free of side effects such as the setting of condition codes and flags. The instruction set design carries the concept of streamlining beyond Reduced Instruction Set Computer (RISC) architectures, to simplify implementations that issue several instructions per machine cycle.
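As a rough illustration of the group (SIMD) idea described above, the following Python sketch treats a 128-bit register value as a vector of equal-sized elements and adds two such values element by element, discarding carries between elements. The function name `group_add` and its parameters are illustrative assumptions; they are not Zeus instruction names or semantics taken from this document.

```python
REGISTER_BITS = 128  # register width stated in the text


def group_add(a: int, b: int, element_bits: int) -> int:
    """Add two 128-bit values as independent element_bits-wide lanes.

    Hypothetical helper for illustration only: carries do not cross
    lane boundaries (modular addition within each lane).
    """
    if element_bits <= 0 or REGISTER_BITS % element_bits != 0:
        raise ValueError("element size must divide the register width")
    lane_mask = (1 << element_bits) - 1
    result = 0
    for shift in range(0, REGISTER_BITS, element_bits):
        lane_a = (a >> shift) & lane_mask
        lane_b = (b >> shift) & lane_mask
        # Keep only the low element_bits of each lane's sum.
        result |= ((lane_a + lane_b) & lane_mask) << shift
    return result
```

With 8-bit elements, `group_add(0x00FF, 0x0001, 8)` wraps the low lane to zero without disturbing its neighbour, whereas the same call with 16-bit elements yields `0x0100`.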

The Zeus memory subsystem provides 64-bit virtual and physical addressing for UNIX, Mach, and other advanced OS environments. Separate address instructions enable the division of the processor into decoupled access and execution units, to reduce the effective latency of memory to the pipeline. The Zeus cache supplies the high data and instruction issue rates of the processor, and supports coherency primitives for scalable multiprocessors. The memory subsystem includes mechanisms for sustaining high data rates not only in block transfer modes, but also in non-unit stride and scattered access patterns.

The Zeus interface subsystem is designed to match industry-standard "Socket 7" protocols and pin-outs. In this way, Zeus can make use of the immense infrastructure of the PC for building low-cost systems. The interface subsystem is modular, and can be replaced with appropriate protocols and pin-outs for lower-cost and higher-performance systems.

The goal of the Zeus architecture is to integrate these processor, memory, and interface capabilities with optimal simplicity and generality. From the software perspective, the entire machine state consists of a program counter, a single bank of 64 general-purpose 128-bit registers, and a linear byte-addressed shared memory space with mapped interface registers. All interrupts and exceptions are precise, and occur with low overhead.

This document is intended for Zeus software and hardware developers alike, and defines the interface at which their designs must meet. Zeus pursues the most efficient tradeoffs between hardware and software complexity by making all processor, memory, and interface resources directly accessible to high-level language programs.

# **Conformance**

To ensure that Zeus systems may freely interchange data, user-level programs, system-level programs and interface devices, the Zeus system architecture reaches above the processor level architecture.

# Mandatory and Optional Areas

A computer system conforms to the requirements of the Zeus System Architecture if and only if it implements all the specifications described in this document and other specifications included by reference. Conformance to the specification is mandatory in all areas, including the instruction set, memory management system, interface devices and external interfaces, and bootstrap ROM functional requirements, except where explicit options are stated.

#### Optional areas include:

- Number of processor threads
- Size of first-level cache memories
- Existence of a second-level cache
- Size of second-level cache memory
- Size of system-level memory
- Existence of certain optional interface device interfaces

# **Upward-compatible Modifications**

From time to time, MicroUnity may modify the architecture in an upward-compatible manner, such as by the addition of new instructions, definition of reserved bits in system state, or addition of new standard interfaces. Such modifications will be added as options, so that designs that conform to this version of the architecture will conform to future, modified versions.

Additional devices and interfaces, not covered by this standard, may be added in specified regions of the physical memory space, provided that system reset places these devices and interfaces in an inactive state that does not interfere with the operation of software that runs in any conformant system. The software interface requirements of any such additional devices and interfaces must be made as widely available as this architecture specification.

# Promotion of Optional Features

It is most strongly recommended that such optional instructions, state or interfaces be implemented in all conforming designs. Such implementations enhance the value of the features in particular and the architecture as a whole by broadening the set of implementations over which software may depend upon the presence of these features.

Implementations that fail to implement these features may encounter unacceptable levels of overhead when attempting to emulate the features by exception handlers or use of virtual memory. This is a particular concern when involved in code that has real-time performance constraints.

In order that upward-compatible optional extensions of the original Zeus system architecture may be relied upon by system and application software, MicroUnity may upon occasion promote optional features to mandatory conformance for implementations designed or produced after a suitable delay following notification by publication of a future version of the specification.

# Unrestricted Physical Implementation

Nothing in this specification should be construed to limit the implementation choices of the conforming system beyond the specific requirements stated herein. In particular, a computer system may conform to the Zeus System Architecture while employing any number of components, dissipating any amount of heat, requiring any special environmental facilities, or being of any physical size.

# **Draft Version**

This document is a draft version of the architectural specification. In this form, conformance to this document may not be claimed or implied. MicroUnity may change this specification at any time, in any manner, until it has been declared final. When this document has been declared final, the only changes will be to correct bugs, defects or deficiencies, and to add upward-compatible optional extensions.

# Common Elements

# **Notation**

The descriptive notation used in this document is summarized in the table below:

| Notation | Meaning |
|----------|---------|
| x + y | two's complement addition of x and y. Result is the same size as the operands, and operands must be of equal size. |
| x - y | two's complement subtraction of y from x. Result is the same size as the operands, and operands must be of equal size. |
| x * y | two's complement multiplication of x and y. Result is the same size as the operands, and operands must be of equal size. |
| x / y | two's complement division of x by y. Result is the same size as the operands, and operands must be of equal size. |
| x & y | bitwise and of x and y. Result is same size as the operands, and operands must be of equal size. |
| x \| y | bitwise or of x and y. Result is same size as the operands, and operands must be of equal size. |
| x ^ y | bitwise exclusive-or of x and y. Result is same size as the operands, and operands must be of equal size. |
| ~x | bitwise inversion of x. Result is same size as the operand. |
| x = y | two's complement equality comparison between x and y. Result is a single bit, and operands must be of equal size. |
| x ≠ y | two's complement inequality comparison between x and y. Result is a single bit, and operands must be of equal size. |
| x < y | two's complement less than comparison between x and y. Result is a single bit, and operands must be of equal size. |
| x ≥ y | two's complement greater than or equal comparison between x and y. Result is a single bit, and operands must be of equal size. |
| √x | floating-point square root of x |
| x \|\| y | concatenation of bit field x to left of bit field y |
| $x^y$ | binary digit x repeated, concatenated y times. Size of result is y. |
| $x_y$ | extraction of bit y (using little-endian bit numbering) from value x. Result is a single bit. |
| $x_{y..z}$ | extraction of bit field formed from bits y through z of value x. Size of result is y-z+1; if z>y, result is an empty string. |
| x ? y : z | value of y, if x is true, otherwise value of z. Value of x is a single bit. |
| x ← y | bitwise assignment of x to value of y |
| Sn | signed, two's complement, binary data format of n bytes |
| Un | unsigned binary data format of n bytes |
| Fn | floating-point data format of n bytes |

descriptive notation
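The bit-field entries in the table can be made concrete with a short Python sketch. It models a bit field as a pair of (value, size in bits); the helper names `extract`, `concat`, and `replicate` are hypothetical and chosen only for this illustration.

```python
def extract(x: int, y: int, z: int) -> tuple[int, int]:
    """Bits y through z of x, using little-endian bit numbering.

    Returns (value, size). The size is y - z + 1, or an empty field
    (0, 0) when z > y, as the table specifies.
    """
    if z > y:
        return (0, 0)
    size = y - z + 1
    return ((x >> z) & ((1 << size) - 1), size)


def concat(x: tuple[int, int], y: tuple[int, int]) -> tuple[int, int]:
    """Concatenate bit field x to the left of bit field y."""
    (x_value, x_size), (y_value, y_size) = x, y
    return ((x_value << y_size) | y_value, x_size + y_size)


def replicate(bit: int, count: int) -> tuple[int, int]:
    """Binary digit `bit` repeated `count` times; result size is count."""
    return (((1 << count) - 1) if bit else 0, count)
```

Combining the three helpers gives, for example, a sign extension of an 8-bit field to 16 bits: `concat(replicate(extract(x, 7, 7)[0], 8), extract(x, 7, 0))`.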


# Bit ordering

The ordering of bits in this document is always little-endian, regardless of the ordering of bytes within larger data structures. Thus, the least-significant bit of a data structure is always labeled 0 (zero), and the most-significant bit is labeled as the data structure size (in bits) minus one.

# Memory

Zeus memory is an array of $2^{64}$ bytes, without a specified byte ordering, which is physically distributed among various components.



#### **Byte**

A byte is a single element of the memory array, consisting of 8 bits:



# Byte ordering

Larger data structures are constructed from the concatenation of bytes in either little-endian or big-endian byte ordering. A memory access of a data structure of size s at address i is formed from memory bytes at addresses i through i+s-1. Unless otherwise specified, there is no specific requirement of alignment: it is not generally required that i be a multiple of s. Aligned accesses are preferred whenever possible, however, as they will often require one fewer processor or memory clock cycle than unaligned accesses.

With little-endian byte ordering, the bytes are arranged as:



With big-endian byte ordering, the bytes are arranged as:

| s*8-1 .. s*8-8 | s*8-9 .. s*8-16 | ... | 7 .. 0 |
|----------------|-----------------|-----|--------|
| byte i         | byte i+1        | ... | byte i+s-1 |
| 8              | 8               | ... | 8      |

Zeus memory is byte-addressed, using either little-endian or big-endian byte ordering. For consistency with the bit ordering, and for compatibility with x86 processors, Zeus uses little-endian byte ordering when an ordering must be selected. Zeus load and store instructions are available for both little-endian and big-endian byte ordering. The selection of byte ordering is dynamic, so that little-endian and big-endian processes, and even data structures within a process, can be intermixed on the processor.
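The two byte orderings and the alignment condition described in this section can be sketched in Python as follows. The memory is modelled as a plain `bytes` object, and the names `read_value` and `is_aligned` are assumptions made for this example rather than anything defined by the architecture.

```python
def read_value(memory: bytes, i: int, s: int, little_endian: bool = True) -> int:
    """Form an s-byte value from memory bytes at addresses i through i+s-1."""
    if i < 0 or i + s > len(memory):
        raise IndexError("access falls outside the modelled memory")
    chunk = memory[i:i + s]
    # Little-endian: byte i is least significant.
    # Big-endian: byte i is most significant.
    return int.from_bytes(chunk, "little" if little_endian else "big")


def is_aligned(i: int, s: int) -> bool:
    """True when address i is a multiple of the access size s."""
    return i % s == 0
```

For memory contents `11 22 33 44 55` (hexadecimal, starting at address 0), a four-byte access at address 1 yields `0x55443322` in little-endian ordering and `0x22334455` in big-endian ordering; the access is unaligned because 1 is not a multiple of 4.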

### Memory read/load semantics

Zeus memory, including memory-mapped registers, must conform to the following requirements regarding side-effects of read or load operations:

A memory read must have no side-effects on the contents of the addressed memory nor on the contents of any other memory.

# Memory write/store semantics

Zeus memory, including memory-mapped registers, must conform to the following requirements regarding side-effects of write or store operations:

A memory write must affect the contents of the addressed memory so that a memory read of the addressed memory returns the value written, and so that a memory read of a portion of the addressed memory returns the appropriate portion of the value written.

A memory write may affect or cause side-effects on the contents of memory not addressed by the write operation; however, a second memory write of the same value to the same address must have no side-effects on any memory. In other words, memory write operations must be idempotent.

Zeus store instructions that are weakly ordered may have side-effects on the contents of memory not addressed by the store itself; subsequent load instructions which are also weakly ordered may or may not return values which reflect the side-effects.
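The read and write requirements above can be illustrated with a toy Python model. The class `MappedRegion`, its register layout (`CONTROL` at address 0, `STATUS` at address 8), and the particular side-effect are all invented for this sketch; they only demonstrate a side-effect that is permitted because repeating the same write changes nothing further.

```python
class MappedRegion:
    """Toy memory with one memory-mapped register (assumed layout)."""

    CONTROL = 0  # hypothetical control register address
    STATUS = 8   # hypothetical status register address

    def __init__(self, size: int = 16) -> None:
        self.cells = bytearray(size)

    def write(self, address: int, data: bytes) -> None:
        end = address + len(data)
        if address < 0 or end > len(self.cells):
            raise IndexError("write falls outside the modelled memory")
        self.cells[address:end] = data
        covers_control = address <= self.CONTROL < end
        covers_status = address <= self.STATUS < end
        if covers_control and not covers_status:
            # Side-effect on memory not addressed by the write. It depends
            # only on the value written, so a repeated write is idempotent.
            self.cells[self.STATUS] = self.cells[self.CONTROL] ^ 0xFF

    def read(self, address: int, size: int) -> bytes:
        # Reads copy data out and never modify the memory contents.
        return bytes(self.cells[address:address + size])
```

A model whose side-effect incremented a counter on every write would fail the idempotence requirement, since a second write of the same value would change memory again.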

# **Data**

Zeus provides eight-byte (64-bit) virtual and physical address sizes, and eight-byte (64-bit) and sixteen-byte (128-bit) data path sizes, and uses fixed-length four-byte (32-bit) instructions. Arithmetic is performed on two's-complement or unsigned binary and ANSI/IEEE standard 754-1985 conforming binary floating-point number representations.
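To show how the same bytes read differently under the unsigned, two's-complement, and floating-point representations named above, here is a small Python sketch. The function `interpretations` is hypothetical, little-endian byte order is assumed, and the two-byte case relies on the binary16 layout supported by Python's `struct` module, which is an assumption about format rather than a statement of the Zeus definition.

```python
import struct


def interpretations(raw: bytes) -> dict[str, object]:
    """Interpret little-endian bytes as unsigned, signed, and floating-point."""
    n = len(raw)
    result: dict[str, object] = {
        f"U{n}": int.from_bytes(raw, "little", signed=False),
        f"S{n}": int.from_bytes(raw, "little", signed=True),
    }
    # struct offers binary16, binary32 and binary64 via 'e', 'f' and 'd'.
    float_codes = {2: "e", 4: "f", 8: "d"}
    if n in float_codes:
        result[f"F{n}"] = struct.unpack("<" + float_codes[n], raw)[0]
    return result
```

For the two bytes `00 C0` (value `0xC000`), the unsigned reading is 49152, the two's-complement reading is -16384, and the half-precision reading is -2.0.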


#### Fixed-point Data

#### Bit

A bit is a primitive data element:

#### Peck

A peck is the catenation of two bits:

#### Nibble

A nibble is the catenation of four bits:

#### Byte

A byte is the catenation of eight bits, and is a single element of the memory array:



#### **Doublet**

A doublet is the catenation of 16 bits, and is the catenation of two bytes:



#### Quadlet

A quadlet is the catenation of 32 bits, and is the catenation of four bytes:




#### Octlet

An octlet is the catenation of 64 bits, and is the catenation of eight bytes:



#### Hexlet

A hexlet is the catenation of 128 bits, and is the catenation of sixteen bytes:

| Bits    | Field         | Width |
|---------|---------------|-------|
| 127..96 | hexlet127..96 | 32    |
| 95..64  | hexlet95..64  | 32    |
| 63..32  | hexlet63..32  | 32    |
| 31..0   | hexlet31..0   | 32    |


#### **Triclet**

A triclet is the catenation of 256 bits, and is the catenation of thirty-two bytes:

| Bits     | Field           | Width |
|----------|-----------------|-------|
| 255..224 | triclet255..224 | 32    |
| 223..192 | triclet223..192 | 32    |
| 191..160 | triclet191..160 | 32    |
| 159..128 | triclet159..128 | 32    |
| 127..96  | triclet127..96  | 32    |
| 95..64   | triclet95..64   | 32    |
| 63..32   | triclet63..32   | 32    |
| 31..0    | triclet31..0    | 32    |

#### **Address**

Zeus addresses, both virtual addresses and physical addresses, are octlet quantities.

### Floating-point Data

Zeus's floating-point formats are designed to satisfy ANSI/IEEE standard 754-1985: Binary Floating-point Arithmetic. Standard 754 leaves certain aspects to the discretion of implementers: additional precision formats, encoding of quiet and signaling NaN values, details of production and propagation of quiet NaN values. These aspects are detailed below.

Zeus adds additional half-precision and quad-precision formats to standard 754's single-precision and double-precision formats. Zeus's double-precision satisfies standard 754's precision requirements for a single extended format, and Zeus's quad precision satisfies standard 754's precision requirements for a double extended format.

Each precision format employs fields labeled s (sign), e (exponent), and f (fraction) to encode values that are (1) NaN: quiet and signaling, (2) infinities: $(-1)^s \infty$, (3) normalized numbers: $(-1)^s 2^{e-bias}(1.f)$, (4) denormalized numbers: $(-1)^s 2^{1-bias}(0.f)$, and (5) zero: $(-1)^s 0$.

Quiet NaN values are denoted by any sign bit value, an exponent field of all one bits, and a non-zero fraction with the most significant bit set. Quiet NaN values generated by default exception handling of standard operations have a zero sign bit, an exponent field of all one bits, a fraction field with the most significant bit set, and all other bits cleared.

Signaling NaN values are denoted by any sign bit value, an exponent field of all one bits, and a non-zero fraction with the most significant bit cleared.

Infinite values are denoted by any sign bit value, an exponent field of all one bits, and a zero fraction field.

Normalized number values are denoted by any sign bit value, an exponent field that is not all one bits or all zero bits, and any fraction field value. The numeric value encoded is $(-1)^s 2^{e-bias}(1.f)$. The bias is equal to the value resulting from setting all but the most significant bit of the exponent field: half: 15, single: 127, double: 1023, and quad: 16383.

Denormalized number values are denoted by any sign bit value, an exponent field that is all zero bits, and a non-zero fraction field value. The numeric value encoded is $(-1)^s 2^{1-bias}(0.f)$.

Zero values are denoted by any sign bit value, an exponent field that is all zero bits, and a fraction field that is all zero bits. The numeric value encoded is $(-1)^s 0$. The distinction between +0 and -0 is significant in some operations.
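The encoding rules above can be sketched as a small classifier. This is an illustrative sketch, not part of the architecture definition: the function name `decode`, the `FORMATS` table, and the use of exact rational arithmetic are assumptions made for the example. The field widths are inferred from the stated biases and the overall format sizes (16, 32, 64, and 128 bits).

```python
from fractions import Fraction

# (exponent bits, fraction bits) per format. Assumed widths: the exponent
# width follows from the stated bias, the fraction width from the total size.
FORMATS = {"half": (5, 10), "single": (8, 23), "double": (11, 52), "quad": (15, 112)}

def decode(bits, fmt):
    """Classify a packed value; return (kind, exact magnitude-with-sign or None).

    Signs of zeros, infinities and NaNs are not reported in this sketch.
    """
    ebits, fbits = FORMATS[fmt]
    s = (bits >> (ebits + fbits)) & 1
    e = (bits >> fbits) & ((1 << ebits) - 1)
    f = bits & ((1 << fbits) - 1)
    bias = (1 << (ebits - 1)) - 1            # all exponent bits but the top one set
    sign = -1 if s else 1
    if e == (1 << ebits) - 1:                # exponent field of all one bits
        if f == 0:
            return ("infinity", None)
        quiet = (f >> (fbits - 1)) & 1       # most significant fraction bit
        return ("qnan" if quiet else "snan", None)
    if e == 0:                               # exponent field of all zero bits
        if f == 0:
            return ("zero", Fraction(0))
        # denormalized: (-1)^s 2^(1-bias) (0.f)
        return ("denorm", sign * Fraction(2) ** (1 - bias) * Fraction(f, 1 << fbits))
    # normalized: (-1)^s 2^(e-bias) (1.f)
    return ("norm", sign * Fraction(2) ** (e - bias) * (1 + Fraction(f, 1 << fbits)))
```

For example, `decode(0x3C00, "half")` classifies the pattern as a normalized number with value 1, and `decode(0x7F800001, "single")` classifies it as a signaling NaN.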

#### Half-precision Floating-point

Zeus half precision uses a format similar to standard 754's requirements, reduced to a 16-bit overall format. The format contains sufficient precision and exponent range to hold a 12-bit signed integer.
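The 12-bit claim can be checked numerically. The sketch below assumes that the Zeus half-precision layout matches the 1/5/10 sign, exponent, and fraction split of IEEE binary16, which Python's `struct` module exposes as format code `e`; the helper name `roundtrip_half` is hypothetical.

```python
import struct

def roundtrip_half(x):
    """Pack x as a 16-bit binary float and unpack it again."""
    return struct.unpack("<e", struct.pack("<e", float(x)))[0]

# Every 12-bit two's-complement integer survives the round trip unchanged.
all_exact = all(roundtrip_half(n) == n for n in range(-2048, 2048))
```

A 13-bit value such as 2049 does not survive, since it needs twelve significand bits and the format provides eleven (ten fraction bits plus the implicit leading one).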




#### Single-precision Floating-point

Zeus single precision satisfies standard 754's requirements for "single."



#### Double-precision Floating-point

Zeus double precision satisfies standard 754's requirements for "double."



#### Quad-precision Floating-point

Zeus quad precision satisfies standard 754's requirements for "double extended," but has additional fraction precision to use 128 bits.

| Bits     | Field    | Width |
|----------|----------|-------|
| 127      | s        | 1     |
| 126..112 | e        | 15    |
| 111..96  | f111..96 | 16    |
| 95..64   | f95..64  | 32    |
| 63..32   | f63..32  | 32    |
| 31..0    | f31..0   | 32    |

# Zeus Processor

MicroUnity's Zeus processor provides the general purpose, high-bandwidth computation capability of the Zeus system. Zeus includes high-bandwidth data paths, register files, and a memory hierarchy. Zeus's memory hierarchy includes on-chip instruction and data memories, instruction and data caches, a virtual memory facility, and interfaces to external devices. Zeus's interfaces in the initial implementation are solely the "Super Socket 7" bus, but other implementations may have different or additional interfaces.

# **Architectural Framework**

The Zeus architecture defines a compatible framework for a family of implementations with a range of capabilities. The following implementation-defined parameters are used in the rest of the document in boldface. The value indicated is for MicroUnity's first Zeus implementation.

| Parameter | Interpretation                                         | Value | Range of legal values |
|-----------|--------------------------------------------------------|-------|-----------------------|
| T         | number of execution threads                            | 4     | 1 ≤ T ≤ 31            |
| CE        | log<sub>2</sub> cache blocks in first-level cache      |       | 0 ≤ CE ≤ 31           |
| CS        | log<sub>2</sub> cache blocks in first-level cache set  | 2     | 0 ≤ CS ≤ 4            |
| CT        | existence of dedicated tags in first-level cache       | 1     | 0 ≤ CT ≤ 1            |
| LE        | log<sub>2</sub> entries in local TB                    | 0     | 0 ≤ LE ≤ 3            |
| LB        | Local TB based on base register                        | 1     | 0 ≤ LB ≤ 1            |
| GE        | log<sub>2</sub> entries in global TB                   | 7     | 0 ≤ GE ≤ 15           |
| GT        | log<sub>2</sub> threads which share a global TB        | 1     | 0 ≤ GT ≤ 3            |

# Interfaces and Block Diagram

The first implementation of Zeus uses "Socket 7" protocols and pinouts.

# Instruction

# Assembler Syntax

Instructions are specified to Zeus assemblers and other code tools (assemblers) in the syntax of an instruction mnemonic (operation code), then optionally white space (blanks or tabs) followed by a list of operands.

The instruction mnemonics listed in this specification are in upper case (capital) letters; assemblers accept either upper case or lower case letters in the instruction mnemonics. In this specification, instruction mnemonics contain periods (".") to separate elements to make them easier to understand; assemblers ignore periods within instruction mnemonics. The instruction mnemonics are designed to be parsed uniquely without the separating periods.

If the instruction produces a register result, this operand is listed first. Following this operand, if there are one or more source operands, is a separator which may be a comma (","), equal ("="), or at-sign ("@"). The equal separates the result operand from the source operands, and may optionally be expressed as a comma in assembler code. The at-sign indicates that the result operand is also a source operand, and may optionally be expressed as a comma in assembler code. If the instruction specification has an equal-sign, an at-sign in assembler code indicates that the result operand should be repeated as the first source operand (for example, "A.ADD.I r4@5" is equivalent to "A.ADD.I r4=r4,5"). Commas always separate the remaining source operands.

The result and source operands are case-sensitive; upper case and lower case letters are distinct. Register operands are specified by the names r0 (or r00) through r63 (a lower case "r" immediately followed by a one or two digit number from 0 to 63), or by the special designations of "lp" for "r0," "dp" for "r1," "fp" for "r62," and "sp" for "r63." Integer-valued operands are specified by an optional sign (-) or (+) followed by a number, and assemblers generally accept a variety of integer-valued expressions.
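The syntax rules above can be illustrated with a small normalizer. This is a hedged sketch, not the behavior of any actual Zeus assembler: the names `normalize`, `canonical`, and `ALIASES` are hypothetical, and the code assumes the instruction specification uses an equal-sign separator, so that an at-sign in the source repeats the result operand as the first source operand.

```python
import re

# Special register designations named in the text.
ALIASES = {"lp": "r0", "dp": "r1", "fp": "r62", "sp": "r63"}

def canonical(op):
    """Map register aliases and two-digit forms (r04) onto r0..r63."""
    op = ALIASES.get(op, op)                 # operands are case-sensitive
    m = re.fullmatch(r"r(\d{1,2})", op)
    if m and int(m.group(1)) <= 63:
        return "r%d" % int(m.group(1))
    return op                                # integer-valued or other operand

def normalize(line):
    """Return (mnemonic, operands) for one line of assembler source."""
    parts = line.split(None, 1)
    mnemonic = parts[0].replace(".", "").upper()   # periods and case are ignored
    if len(parts) == 1:
        return mnemonic, []
    text = parts[1].replace(" ", "").replace("\t", "")
    result, sep, rest = re.match(r"([^,=@]+)([,=@]?)(.*)$", text).groups()
    operands = [result]
    if sep == "@":                           # assumed: specification has "="
        operands.append(result)
    operands += [tok for tok in rest.split(",") if tok]
    return mnemonic, [canonical(op) for op in operands]
```

Under these assumptions, `normalize("A.ADD.I r4@5")` and `normalize("a.add.i r4=r4,5")` both yield the mnemonic `AADDI` with operands `r4`, `r4`, and `5`.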

#### Instruction Structure

A Zeus instruction is specifically defined as a four-byte structure with the little-endian ordering shown below. It is different from the quadlet defined above because the placement of instructions into memory must be independent of the byte ordering used for data structures. Instructions must be aligned on four-byte boundaries; in the diagram below, i must be a multiple of 4.

| 31..24   | 23..16   | 15..8    | 7..0   |
|----------|----------|----------|--------|
| byte i+3 | byte i+2 | byte i+1 | byte i |
| 8        | 8        | 8        | 8      |
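As a minimal sketch of the layout above, an instruction fetch assembles four bytes in little-endian order regardless of the byte ordering chosen for data. The function name `fetch_instruction` and the use of a Python `bytes` object as memory are assumptions for illustration.

```python
def fetch_instruction(memory, i):
    """Read the four-byte instruction at byte address i (always little-endian)."""
    if i % 4 != 0:
        raise ValueError("instruction address must be a multiple of 4")
    return int.from_bytes(memory[i:i + 4], "little")
```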

### <u>Gateway</u>

A Zeus gateway is specifically defined as an 8-byte structure with the little-endian ordering shown below. A gateway contains a code address used to securely invoke a system call or procedure at a higher privilege level. Gateways are marked by protection information specified in the TB. Gateways must be aligned on 8-byte boundaries; in the diagram below, i must be a multiple of 8.

| 63..56   | 55..48   | 47..40   | 39..32   | 31..24   | 23..16   | 15..8    | 7..0   |
|----------|----------|----------|----------|----------|----------|----------|--------|
| byte i+7 | byte i+6 | byte i+5 | byte i+4 | byte i+3 | byte i+2 | byte i+1 | byte i |
| 8        | 8        | 8        | 8        | 8        | 8        | 8        | 8      |

The gateway contains two data items within its structure, a code address and a new privilege level:

| 63..2        | 1..0 |
|--------------|------|
| code address | pl   |
| 62           | 2    |

The virtual memory system can be used to designate a region of memory as containing gateways. Other data may be placed within the gateway region, provided that if an attempt is made to use the additional data as a gateway, that security cannot be violated. For example, 64-bit data or stack pointers which are aligned to at least 4 bytes and are in little-endian byte order have pl=0, so that the privilege level cannot be raised by attempting to use the additional data as a gateway.
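The gateway fields can be extracted as follows. This is a sketch under stated assumptions: the name `unpack_gateway` is hypothetical, and the code address is returned as a byte address with its two low-order bits cleared, on the assumption that it is interpreted like the program counter.

```python
def unpack_gateway(memory, i):
    """Return (code_address, pl) from the 8-byte gateway at byte address i."""
    if i % 8 != 0:
        raise ValueError("gateway address must be a multiple of 8")
    octlet = int.from_bytes(memory[i:i + 8], "little")
    pl = octlet & 3                  # bits 1..0: new privilege level
    code_address = octlet & ~3       # bits 63..2, low two bits cleared (assumed)
    return code_address, pl
```

A little-endian pointer aligned to at least 4 bytes, such as 0x7FFFFFF0, unpacks with pl equal to 0, which is the property the paragraph above relies on.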

# <u>User State</u>

The user state consists of hardware data structures that are accessible to all conventional compiled code. The Zeus user state is designed to be as regular as possible, and consists only of the general registers, the program counter, and virtual memory. There are no specialized registers for condition codes, operating modes, rounding modes, integer multiply/divide, or floating-point values.

#### **General Registers**

Zeus user state includes 64 general registers. All are identical; there is no dedicated zero-valued register, and there are no dedicated floating-point registers.

| 127..0  |
|---------|
| REG[0]  |
| REG[1]  |
| REG[2]  |
| ...     |
| REG[62] |
| REG[63] |
| 128     |

Some Zeus instructions have 64 bit register operands. These operands are sign-extended to 128 bits when written to the register file, and the low-order 64 bits are chosen when read from the register file.

#### Definition

val ← REG[rn]
endcase
enddef

def RegWrite(rn, size, val)
case size of
64:
REG[rn] ← val<sub>63</sub><sup>64</sup> || val<sub>63..0</sub>
128:
REG[rn] ← val<sub>127..0</sub>
endcase
enddef
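The register access rules described above (64-bit operands sign-extended to 128 bits on write, low-order 64 bits selected on read) can be sketched in executable form. The class name `RegisterFile` and its method names are assumptions for illustration; register contents are modeled as unsigned Python integers.

```python
MASK64 = (1 << 64) - 1
MASK128 = (1 << 128) - 1

class RegisterFile:
    """Sketch of 64 general registers, each 128 bits wide."""

    def __init__(self):
        self.reg = [0] * 64

    def write(self, rn, size, val):
        if size == 64:
            val &= MASK64
            if val >> 63:                    # replicate bit 63 into bits 127..64
                val |= MASK64 << 64
            self.reg[rn] = val
        elif size == 128:
            self.reg[rn] = val & MASK128
        else:
            raise ValueError("size must be 64 or 128")

    def read(self, rn, size):
        if size == 64:
            return self.reg[rn] & MASK64     # low-order 64 bits
        if size == 128:
            return self.reg[rn]
        raise ValueError("size must be 64 or 128")
```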

#### Program Counter

The program counter contains the address of the currently executing instruction. This register is implicitly manipulated by branch instructions, and read by branch instructions that save a return address in a general register.

| 63..2          | 1..0 |
|----------------|------|
| ProgramCounter | 0    |
| 62             | 2    |

#### Privilege Level

The privilege level register contains the privilege level of the currently executing instruction. This register is implicitly manipulated by branch gateway and branch down instructions, and read by branch gateway instructions that save a return address in a general register.

# Program Counter and Privilege Level

The program counter and privilege level may be packed into a single octlet. This combined data structure is saved by the Branch Gateway instruction and restored by the Branch Down instruction.



# System state

The system state consists of the facilities not normally used by conventional compiled code. These facilities provide mechanisms to execute such code in a fully virtual environment. All system state is memory mapped, so that it can be manipulated by compiled code.

# Fixed-point

Zeus provides load and store instructions to move data between memory and the registers, branch instructions to compare the contents of registers and to transfer control from one code address to another, and arithmetic operations to perform computation on the contents of registers, returning the result to registers.

#### Load and Store

The load and store instructions move data between memory and the registers. When loading data from memory into a register, values are zero-extended or sign-extended to fill the register. When storing data from a register into memory, values are truncated on the left to fit the specified memory region.

Load and store instructions that specify a memory region of more than one byte may use either little-endian or big-endian byte ordering; the size and ordering are explicitly specified in the instruction. Regions larger than one byte may be either aligned to addresses that are an even multiple of the size of the region or of unspecified alignment; alignment checking is also explicitly specified in the instruction.

Load and store instructions specify memory addresses as the sum of a base general register and the product of the size of the memory region and either an immediate value or another general register. Scaling maximizes the memory space which can be reached by immediate offsets from a single base general register, and assists in generating memory addresses within iterative loops. Alignment of the address can be reduced to checking the alignment of the first general register.
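The scaled addressing rule can be sketched as follows. The function name `effective_address` and the wrap to 64 bits are assumptions for illustration. Because the scaled offset is always a multiple of the region size, checking the base register alone is sufficient for the alignment check.

```python
MASK64 = (1 << 64) - 1

def effective_address(base, index, size, check_alignment=False):
    """base + size * index, where index is an immediate or a register value.

    size is the memory region size in bytes (a power of two).
    """
    if check_alignment and base % size != 0:
        # size * index is a multiple of size, so only the base needs checking
        raise ValueError("misaligned base address")
    return (base + size * index) & MASK64    # wrap to 64 bits (assumed)
```

For an octlet access with base 0x1000 and immediate 3, the address is 0x1018; an immediate of -1 reaches 0xFF8.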

The load and store instructions are used for fixed-point data as well as floating-point and digital signal processing data; Zeus has a single bank of registers for all data types.

Swap instructions provide multithread and multiprocessor synchronization, using indivisible operations: add-swap, compare-swap, multiplex-swap, and double-compare-swap. A store-multiplex operation provides the ability to indivisibly write to a portion of an octlet. These instructions always operate on aligned octlet data, using either little-endian or big-endian byte ordering.

#### **Branch**

The fixed-point compare-and-branch instructions provide all arithmetic tests for equality and inequality of signed and unsigned fixed-point values. Tests are performed either between two operands contained in general registers, or on the bitwise AND of two operands. Depending on the result of the compare, either a branch is taken, or not taken. A taken branch causes an immediate transfer of the program counter to the target of the branch, specified by a 12-bit signed offset from the location of the branch instruction. A non-taken branch causes no transfer; execution continues with the following instruction.

Other branch instructions provide for unconditional transfer of control to addresses too distant to be reached by a 12-bit offset, and to transfer to a target while placing the location following the branch into a register. The branch through gateway instruction provides a secure means to access code at a higher privilege level, in a form similar to a normal procedure call.

#### Addressing Operations

A subset of general fixed-point arithmetic operations is available as addressing operations. These include add, subtract, Boolean, and simple shift operations. These addressing operations may be performed at a point in the Zeus processor pipeline so that they may be completed prior to or in conjunction with the execution of load and store operations in a "superspring" pipeline in which other arithmetic operations are deferred until the completion of load and store operations.

### **Execution Operations**

Many of the operations used for Digital Signal Processing (DSP), which are described in greater detail below, are also used for performing simple scalar operations. These operations perform arithmetic operations on values of 8-, 16-, 32-, 64-, or 128-bit sizes, which are right-aligned in registers. These execution operations include the add, subtract, boolean and simple shift operations which are also available as addressing operations, but further extend the available set to include three-operand add/subtract, three-operand boolean, dynamic shifts, and bit-field operations.

# Floating-point

Zeus provides all the facilities mandated and recommended by ANSI/IEEE standard 754-1985: Binary Floating-point Arithmetic, with the use of supporting software.

# **Branch Conditionally**


The floating-point compare-and-branch instructions provide all the comparison types required and suggested by the IEEE floating-point standard. These floating-point comparisons augment the usual types of numeric value comparisons with special handling for NaN (not-a-number) values. A NaN value compares as "unordered" with respect to any other value, even that of an identical NaN value.

Zeus floating-point compare-branch instructions do not generate an exception on comparisons involving quiet or signaling NaN values. If such exceptions are desired, they can be obtained by combining the use of a floating-point compare-set instruction, with either a floating-point compare-branch instruction on the floating-point operands or a fixed-point compare-branch on the set result.

Because the less and greater relations are anti-commutative, one of each relation that differs from another only by the replacement of an L with a G in the code can be removed by reversing the order of the operands and using the other code. Thus, an L relation can be used in place of a G relation by swapping the operands to the compare-branch or compare-set instruction.

No instructions are provided that branch when the values are unordered. To accomplish such an operation, use the reverse condition to branch over an immediately following unconditional branch, or in the case of an if-then-else clause, reverse the clauses and use the reverse condition.

The E relation can be used to determine the unordered condition of a single operand by comparing the operand with itself.

The following floating-point compare-branch relations are provided as instructions:

| Mnemonic code | C-like | Taken if unordered | Taken if greater | Taken if less | Taken if equal | Exception if unordered | Exception if invalid |
|---------------|--------|--------------------|------------------|---------------|----------------|------------------------|----------------------|
| E             | ==     | F                  | F                | F             | T              | no                     | no                   |
| LG            | <>     | F                  | T                | T             | F              | no                     | no                   |
| L             | <      | F                  | F                | T             | F              | no                     | no                   |
| GE            | >=     | F                  | T                | F             | T              | no                     | no                   |
compare-branch relations
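The relations in the table, the operand-swapping rule, and the self-comparison test for the unordered condition can be sketched as follows. The names `relation`, `TAKEN`, `branch_taken`, `greater`, and `is_unordered` are assumptions for illustration, and Python floats stand in for Zeus floating-point operands.

```python
import math

def relation(x, y):
    """Classify a pair as unordered, less, greater or equal."""
    if math.isnan(x) or math.isnan(y):
        return "unordered"
    return "less" if x < y else "greater" if x > y else "equal"

# Outcomes for which each compare-branch code takes the branch.
TAKEN = {
    "E": {"equal"},
    "LG": {"less", "greater"},
    "L": {"less"},
    "GE": {"greater", "equal"},
}

def branch_taken(code, x, y):
    return relation(x, y) in TAKEN[code]

def greater(x, y):
    """A G relation obtained from L by swapping the operands."""
    return branch_taken("L", y, x)

def is_unordered(x):
    """A single operand is unordered (a NaN) if it does not equal itself."""
    return not branch_taken("E", x, x)
```

No code takes the branch when either operand is a NaN, which is why branching on the unordered condition requires the reversed-condition idiom described above.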

#### Compare-set

The compare-set floating-point instructions provide all the comparison types supported as branch instructions. Zeus compare-set floating-point instructions may optionally generate an exception on comparisons involving quiet or signaling NaNs.

The following floating-point compare-set relations are provided as instructions:

| Mnemonic code | C-like | Result if unordered | Result if greater | Result if less | Result if equal | Exception if unordered | Exception if invalid |
|---------------|--------|---------------------|-------------------|----------------|-----------------|------------------------|----------------------|
| E             | ==     | F                   | F                 | F              | T               | no                     | no                   |
| LG            | <>     | F                   | T                 | T              | F               | no                     | no                   |
| L             | <      | F                   | F                 | T              | F               | no                     | no                   |
| GE            | >=     | F                   | T                 | F              | T               | no                     | no                   |
| E.X           | ==     | F                   | F                 | F              | T               | no                     | yes                  |
| LG.X          | <>     | F                   | T                 | T              | F               | no                     | yes                  |
| L.X           | <      | F                   | F                 | T              | F               | yes                    | yes                  |
| GE.X          | >=     | F                   | T                 | F              | T               | yes                    | yes                  |

compare-set relations

# Arithmetic Operations

The basic operations supported in hardware are floating-point add, subtract, multiply, divide, square root and conversions among floating-point formats and between floating-point and binary integer formats.

Software libraries provide other operations required by the ANSI/IEEE floating-point standard.


The operations explicitly specify the precision of the operation, and round the result (or check that the result is exact) to the specified precision at the conclusion of each operation. Each of the basic operations splits operand registers into symbols of the specified precision and performs the same operation on corresponding symbols.

In addition to the basic operations, Zeus performs a variety of operations in which one or more products are summed to each other and/or to an additional operand. The instructions include a fused multiply-add (E.MULADD.F), convolve (E.CON.F), matrix multiply (E.MULMAT.F), and scale-add (E.SCALADD.F).

The results of these operations are computed as if the multiplies are performed to infinite precision, added as if in infinite precision, then rounded only once. Consequently, these operations are performed with no rounding of intermediate results that would have limited the accuracy of the result.
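The effect of rounding only once can be illustrated numerically. The sketch below is not a model of E.MULADD.F itself: it uses Python double-precision floats as the working precision and exact rational arithmetic to stand in for "infinite precision", and the name `fused_multiply_add` is hypothetical.

```python
from fractions import Fraction

def fused_multiply_add(a, b, c):
    """a*b + c computed exactly, then rounded once to a double."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))

a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

separate = a * b + c                  # the product is rounded before the add
fused = fused_multiply_add(a, b, c)   # the sum is rounded only once
```

Here the exact product is 1 - 2^-60, which rounds to 1.0 as a double, so the separately rounded result is 0.0, while the fused result retains the value -2^-60.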

#### Rounding and exceptions

Rounding is specified within the instructions explicitly, to avoid explicit state registers for a rounding mode. Similarly, the instructions explicitly specify how standard exceptions (invalid operation, division by zero, overflow, underflow and inexact) are to be handled.

When no rounding is explicitly named by the instruction (default), round-to-nearest rounding is performed, and all floating-point exception signals cause the standard-specified default result, rather than a trap. When rounding is explicitly named by the instruction (N: nearest, Z: zero, F: floor, C: ceiling), the specified rounding is performed, and floating-point exception signals other than inexact cause a floating-point exception trap. When X (exact, or exception) is specified, all floating-point exception signals cause a floating-point exception trap, including inexact.
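The mapping from rounding control to trapping behavior can be summarized in a short sketch. The names `ROUNDING`, `STANDARD_EXCEPTIONS`, and `trapping_exceptions`, and the use of an empty string for the default control, are assumptions for illustration.

```python
STANDARD_EXCEPTIONS = frozenset(
    {"invalid", "divide_by_zero", "overflow", "underflow", "inexact"}
)

# Rounding performed for each control; "" stands for the default (none named).
ROUNDING = {"": "nearest", "N": "nearest", "Z": "zero", "F": "floor", "C": "ceiling"}

def trapping_exceptions(control):
    """Return the set of exception signals that cause a trap."""
    if control == "":
        return frozenset()                        # default results, never a trap
    if control == "X":
        return STANDARD_EXCEPTIONS                # everything traps, even inexact
    if control in ROUNDING:
        return STANDARD_EXCEPTIONS - {"inexact"}  # N, Z, F, C
    raise ValueError("unknown rounding control: %r" % control)
```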

This technique assists the Zeus processor in executing floating-point operations with greater parallelism. When default rounding and exception handling control is specified in floating-point instructions, Zeus may safely retire instructions following them, as they are guaranteed not to cause data-dependent exceptions. Similarly, floating-point instructions with N, Z, F, or C control can be guaranteed not to cause data-dependent exceptions once the operands have been examined to rule out invalid operations, division by zero, overflow or underflow exceptions. Only floating-point instructions with X control or when exceptions cannot be ruled out with N, Z, F, or C control need to avoid retiring following instructions until the final result is generated.

ANSI/IEEE standard 754-1985 specifies information to be given to trap handlers for the five floating-point exceptions. The Zeus architecture produces a precise exception (the program counter points to the instruction that caused the exception and all register state is present), from which all the required information can be produced in software, as all source operand values and the specified operation are available.

<sup>1</sup>U.S. Patent 5,812,639 describes this "Technique of incorporating floating point information into processor instructions."

ANSI/IEEE standard 754-1985 specifies a set of five "sticky-exception" bits, for recording the occurrence of exceptions that are handled by default. The Zeus architecture produces a precise exception for instructions with N, Z, F, or C control for invalid operation, division by zero, overflow or underflow exceptions and with X control for all floating-point exceptions, from which corresponding sticky-exception bits can be set. Execution of the same instruction with default control will compute the default result with round-to-nearest rounding. Most compound operations not specified by the standard are not available with rounding and exception controls.

#### NaN handling

ANSI/IEEE standard 754-1985 specifies that operations involving a signaling NaN or invalid operation shall, if no trap occurs and if a floating-point result is to be delivered, deliver a quiet NaN as its result. However, it fails to specify what quiet NaN value to deliver.

Zeus operations that produce a floating-point result and do not trap on invalid operations propagate signaling NaN values from operands to results, changing the signaling NaN values to quiet NaN values by setting the most significant fraction bit and leaving the remaining bits unchanged. Other causes of invalid operations produce the default quiet NaN value, where the sign bit is zero, the exponent field is all one bits, the most significant fraction bit is set and the remaining fraction bits are zero bits. For Zeus operations that produce multiple results catenated together, signaling NaN propagation or quiet NaN production is handled separately and independently for each result symbol.
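The two rules above reduce to simple bit operations on the packed value. In this sketch the function names `quiet_nan` and `default_qnan` are hypothetical, and the field widths passed in the examples (8 exponent and 23 fraction bits for single precision, 5 and 10 for half precision) are assumptions inferred from the stated biases and format sizes.

```python
def quiet_nan(bits, fbits):
    """Quiet a signaling NaN: set the most significant fraction bit only."""
    return bits | (1 << (fbits - 1))

def default_qnan(ebits, fbits):
    """Default quiet NaN: sign 0, exponent all ones, top fraction bit set."""
    return (((1 << ebits) - 1) << fbits) | (1 << (fbits - 1))
```

For single precision, `quiet_nan(0x7F800001, 23)` gives 0x7FC00001, and `default_qnan(8, 23)` gives 0x7FC00000.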

ANSI/IEEE standard 754-1985 specifies that quiet NaN values should be propagated from operand to result by the basic operations. However, it fails to specify which of several quiet NaN values to propagate when more than one operand is a quiet NaN. In addition, the standard does not clearly specify how quiet NaN should be propagated for the multiple-operation instructions provided in Zeus. The standard does not specify the quiet NaN produced as a result of an operand being a signaling NaN when invalid operation exceptions are handled by default. The standard leaves unspecified how quiet and signaling NaN values are propagated through format conversions and the absolute-value, negate and copy operations. This section specifies these aspects left unspecified by the standard.

First of all, for Zeus operations that produce multiple results catenated together, quiet and signaling NaN propagation is handled separately and independently for each result symbol. A quiet or signaling NaN value in a single symbol of an operand causes only those result symbols that are dependent on that operand symbol's value to be propagated as that quiet NaN. Multiple quiet or signaling NaN values in symbols of an operand which influence separate symbols of the result are propagated independently of each other. Any signaling NaN that is propagated has the high-order fraction bit set to convert it to a quiet NaN.

For Zeus operations in which multiple symbols among operands upon which a result symbol is dependent are quiet or signaling NaNs, a priority rule will determine which NaN is propagated. Priority shall be given to the operand that is specified by a register definition at a lower-numbered (little-endian) bit position within the instruction (rb has priority over rc, which has priority over rd). In the case of operands which are catenated from two registers, priority shall be assigned based on the register which has highest priority (lower-numbered bit position within the instruction). In the case of a tie (as when the E.SCAL.ADD scaling operand has two corresponding NaN values, or when an E.MUL\_CP operand has NaN values for both real and imaginary components of a value), the value which is located at a lower-numbered (little-endian) bit position within the operand is to receive priority. The identification of a NaN as quiet or signaling shall not confer any priority for selection; only the operand position does, though a signaling NaN will cause an invalid operand exception.

The sign bit of NaN values propagated shall be complemented if the instruction subtracts or negates the corresponding operand or (but not and) multiplies it by or divides it by or divides it into an operand which has the sign bit set, even if that operand is another NaN. If a NaN is both subtracted and multiplied by a negative value, the sign bit shall be propagated unchanged.

For Zeus operations that convert between two floating-point formats (INFLATE and DEFLATE), NaN values are propagated by preserving the sign and the most-significant fraction bits, except that the most-significant fraction bit of a signaling NaN is set and (for DEFLATE) the least-significant fraction bit preserved is combined, via a logical-or, with all fraction bits not preserved. All additional fraction bits (for INFLATE) are set to zero.
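As an illustration of the fraction handling just described, the following Python sketch operates on the fraction field alone. The function names and the choice of 23-bit and 10-bit fraction widths in the usage lines are assumptions made for the example; setting the most-significant bit unconditionally is harmless for quiet NaNs under the assumption that quiet NaNs already have that bit set.

```python
def deflate_nan_fraction(frac: int, src_bits: int, dst_bits: int) -> int:
    """Narrow a NaN fraction from src_bits to dst_bits."""
    dropped = src_bits - dst_bits
    kept = frac >> dropped                      # most-significant bits
    sticky = 1 if frac & ((1 << dropped) - 1) else 0
    kept |= sticky                              # OR discarded bits into LSB
    kept |= 1 << (dst_bits - 1)                 # quiet a signaling NaN
    return kept


def inflate_nan_fraction(frac: int, src_bits: int, dst_bits: int) -> int:
    """Widen a NaN fraction; the added low-order bits are zero."""
    quieted = frac | (1 << (src_bits - 1))
    return quieted << (dst_bits - src_bits)


narrow = deflate_nan_fraction(0x000001, 23, 10)  # payload only in low bit
wide = inflate_nan_fraction(0x001, 10, 23)
```

The logical-or into the preserved least-significant bit keeps a narrowed NaN from collapsing into an all-zero fraction, which would otherwise read as infinity.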

For Zeus operations that convert from a floating-point format to a fixed-point format (SINK), NaN values produce zero values (maximum-likelihood estimate). Infinity values produce the largest representable positive or negative fixed-point value that fits in the destination field. When exception traps are enabled, NaN or Infinity values produce a floating-point exception. Underflows do not occur in the SINK operation; such values produce -1, 0 or +1, depending on rounding controls.
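The default (non-trapping) behavior can be sketched in Python as follows. The `sink` helper is a hypothetical name; it assumes a signed two's-complement destination field, round-to-nearest-even rounding, and saturation of out-of-range finite values, and it does not model the trapping variants.

```python
import math


def sink(x: float, bits: int) -> int:
    """Convert x to a signed fixed-point field of `bits` bits (default handling)."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    if math.isnan(x):
        return 0                      # NaN produces zero
    if math.isinf(x):
        return hi if x > 0 else lo    # infinity saturates
    return max(lo, min(hi, round(x)))  # round() is round-half-to-even
```

For example, `sink(float("inf"), 16)` gives the largest positive doublet value and `sink(float("nan"), 16)` gives zero.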

For absolute value, negate, or copy operations, NaN values are propagated with the sign bit cleared, complemented, or copied, respectively. Signalling NaN values cause the Invalid operation exception, propagating a quieted NaN in corresponding symbol locations (default) or an exception, as specified by the instruction.

# Floating-point functions

The following functions are defined for use within the detailed instruction definitions in the following section. In these functions an internal format represents infinite-precision floating-point values as a four-element structure consisting of (1) s (sign bit): 0 for positive, 1 for negative, (2) t (type): NORM, ZERO, SNAN, QNAN, INFINITY, (3) e (exponent), and (4) f (fraction). The mathematical interpretation of a normal value places the binary point at the units of the fraction, adjusted by the exponent: $(-1)^s \cdot 2^e \cdot f$. The function F converts a packed IEEE floating-point value into internal format. The function PackF converts an internal format back into IEEE floating-point format, with rounding and exception control.

#### Definition

```
def eb ← ebits(prec) as
    case prec of
        16:
            eb ← 5
        32:
            eb ← 8
        64:
            eb ← 11
        128:
            eb ← 15
    endcase
enddef

def eb ← ebias(prec) as
    eb ← 0 || 1^{ebits(prec)-1}
enddef

def fb ← fbits(prec) as
    fb ← prec - 1 - ebits(prec)
enddef

def a ← F(prec, ai) as
    a.s ← ai_{prec-1}
    ae ← ai_{prec-2..fbits(prec)}
    af ← ai_{fbits(prec)-1..0}
    if ae = 1^{ebits(prec)} then
        if af = 0 then
            a.t ← INFINITY
        elseif af_{fbits(prec)-1} = 0 then
            a.t ← SNAN
            a.e ← -fbits(prec)
            a.f ← 1 || af_{fbits(prec)-2..0}
        else
            a.t ← QNAN
            a.e ← -fbits(prec)
            a.f ← af
        endif
    elseif ae = 0 then
        if af = 0 then
            a.t ← ZERO
        else
            a.t ← NORM
            a.e ← 1-ebias(prec)-fbits(prec)
            a.f ← 0 || af
        endif
    else
        a.t ← NORM
        a.e ← ae-ebias(prec)-fbits(prec)
        a.f ← 1 || af
    endif
enddef

def a ← DEFAULTQNAN as
    a.s ← 0
    a.t ← QNAN
    a.e ← -1
    a.f ← 1
enddef
```

```
def a ← DEFAULTSNAN as
    a.s ← 0
    a.t ← SNAN
    a.e ← -1
    a.f ← 1
enddef

def fadd(a,b) as faddr(a,b,N) enddef

def c ← faddr(a,b,round) as
    if a.t=NORM and b.t=NORM then
        // d,e are a,b with exponent aligned and fraction adjusted
        if a.e > b.e then
            d ← a
            e.t ← b.t
            e.s ← b.s
            e.e ← a.e
            e.f ← b.f || 0^{a.e-b.e}
        elseif a.e < b.e then
            d.t ← a.t
            d.s ← a.s
            d.e ← b.e
            d.f ← a.f || 0^{b.e-a.e}
            e ← b
        endif
        c.t ← d.t
        c.e ← d.e
        if d.s = e.s then
            c.s ← d.s
            c.f ← d.f + e.f
        elseif d.f > e.f then
            c.s ← d.s
            c.f ← d.f - e.f
        elseif d.f < e.f then
            c.s ← e.s
            c.f ← e.f - d.f
        else
            c.s ← (round=F)
            c.t ← ZERO
        endif
    // priority is given to b operand for NaN propagation
    elseif (b.t=SNAN) or (b.t=QNAN) then
        c ← b
    elseif (a.t=SNAN) or (a.t=QNAN) then
        c ← a
    elseif a.t=ZERO and b.t=ZERO then
        c.t ← ZERO
        c.s ← (a.s and b.s) or ((round=F) and (a.s or b.s))
    // NULL values are like zero, but do not combine with ZERO to alter sign
    elseif a.t=ZERO or a.t=NULL then
        c ← b
    elseif b.t=ZERO or b.t=NULL then
        c ← a
    elseif a.t=INFINITY and b.t=INFINITY then
        if a.s ≠ b.s then
            c ← DEFAULTSNAN // Invalid
        else
            c ← a
        endif
    elseif a.t=INFINITY then
        c ← a
    elseif b.t=INFINITY then
        c ← b
    else
        assert FALSE // should have covered all the cases above
    endif
enddef

def b ← fneg(a) as
    b.s ← ~a.s
    b.t ← a.t
    b.e ← a.e
    b.f ← a.f
enddef

def fsub(a,b) as fsubr(a,b,N) enddef

def fsubr(a,b,round) as faddr(a,fneg(b),round) enddef

def frsub(a,b) as frsubr(a,b,N) enddef

def frsubr(a,b,round) as faddr(fneg(a),b,round) enddef

def c ← fcom(a,b) as
    if (a.t=SNAN) or (a.t=QNAN) or (b.t=SNAN) or (b.t=QNAN) then
        c ← U
    elseif a.t=INFINITY and b.t=INFINITY then
        if a.s ≠ b.s then
            c ← (a.s=0) ? G : L
        else
            c ← E
        endif
    elseif a.t=INFINITY then
        c ← (a.s=0) ? G : L
    elseif b.t=INFINITY then
        c ← (b.s=0) ? G : L
    elseif a.t=NORM and b.t=NORM then
        if a.s ≠ b.s then
            c ← (a.s=0) ? G : L
        else
            if a.e > b.e then
                af ← a.f
                bf ← b.f || 0^{a.e-b.e}
            else
                af ← a.f || 0^{b.e-a.e}
                bf ← b.f
            endif
            if af = bf then
                c ← E
```

```
            else
                c ← ((a.s=0) ^ (af > bf)) ? G : L
            endif
        endif
    elseif a.t=NORM then
        c ← (a.s=0) ? G : L
    elseif b.t=NORM then
        c ← (b.s=0) ? G : L
    elseif a.t=ZERO and b.t=ZERO then
        c ← E
    else
        assert FALSE // should have covered all the cases above
    endif
enddef

def c ← fmul(a,b) as
    if a.t=NORM and b.t=NORM then
        c.s ← a.s ^ b.s
        c.t ← NORM
        c.e ← a.e + b.e
        c.f ← a.f * b.f
    // priority is given to b operand for NaN propagation
    elseif (b.t=SNAN) or (b.t=QNAN) then
        c.s ← a.s ^ b.s
        c.t ← b.t
        c.e ← b.e
        c.f ← b.f
    elseif (a.t=SNAN) or (a.t=QNAN) then
        c.s ← a.s ^ b.s
        c.t ← a.t
        c.e ← a.e
        c.f ← a.f
    elseif a.t=ZERO and b.t=INFINITY then
        c ← DEFAULTSNAN // Invalid
    elseif a.t=INFINITY and b.t=ZERO then
        c ← DEFAULTSNAN // Invalid
    elseif a.t=ZERO or b.t=ZERO then
        c.s ← a.s ^ b.s
        c.t ← ZERO
    else
        assert FALSE // should have covered all the cases above
    endif
enddef

def c ← fdiv(a,b) as
    if a.t=NORM and b.t=NORM then
        c.s ← a.s ^ b.s
        c.t ← NORM
        c.e ← a.e - b.e + 256
        c.f ← (a.f || 0^{256}) / b.f
    // priority is given to b operand for NaN propagation
    elseif (b.t=SNAN) or (b.t=QNAN) then
        c.s ← a.s ^ b.s
        c.t ← b.t
        c.e ← b.e
```

```
        c.f ← b.f
    elseif (a.t=SNAN) or (a.t=QNAN) then
        c.s ← a.s ^ b.s
        c.t ← a.t
        c.e ← a.e
        c.f ← a.f
    elseif a.t=ZERO and b.t=ZERO then
        c ← DEFAULTSNAN // Invalid
    elseif a.t=INFINITY and b.t=INFINITY then
        c ← DEFAULTSNAN // Invalid
    elseif a.t=ZERO then
        c.s ← a.s ^ b.s
        c.t ← ZERO
    elseif a.t=INFINITY then
        c.s ← a.s ^ b.s
        c.t ← INFINITY
    else
        assert FALSE // should have covered all the cases above
    endif
enddef

def msb ← findmsb(a) as
    MAXF ← 2^18 // Largest possible f value after matrix multiply
    for j ← 0 to MAXF
        if a_{MAXF-1..j} = (0^{MAXF-1-j} || 1) then
            msb ← j
        endif
    endfor
enddef

def ai ← PackF(prec,a,round) as
    case a.t of
        NORM:
            msb ← findmsb(a.f)
            rn ← msb-1-fbits(prec) // lsb for normal
            rdn ← -ebias(prec)-a.e-1-fbits(prec) // lsb if a denormal
            rb ← (rn > rdn) ? rn : rdn
            if rb ≤ 0 then
                aifr ← a.f_{msb-1..0} || 0^{-rb}
                eadj ← 0
            else
                case round of
                    C:
                        s ← 0^{msb-rb} || (~a.s)^{rb}
                    F:
                        s ← 0^{msb-rb} || (a.s)^{rb}
                    N, NONE:
                        s ← 0^{msb-rb} || ~a.f_{rb} || (a.f_{rb})^{rb-1}
                    X:
                        if a.f_{rb-1..0} ≠ 0 then
                            raise FloatingPointArithmetic // Inexact
                        endif
                        s ← 0
                endcase
                v ← (0 || a.f_{msb..0}) + (0 || s)
                if v_{msb} = 1 then
                    aifr ← v_{msb-1..rb}
                    eadj ← 0
                else
                    aifr ← 0^{fbits(prec)}
                    eadj ← 1
                endif
            endif
            aien ← a.e + msb - 1 + eadj + ebias(prec)
            if aien ≤ 0 then
                if round = NONE then
                    ai ← a.s || 0^{ebits(prec)} || aifr
                else
                    raise FloatingPointArithmetic // Underflow
                endif
            elseif aien ≥ 1^{ebits(prec)} then
                if round = NONE then
                    // default: round-to-nearest overflow handling
                    ai ← a.s || 1^{ebits(prec)} || 0^{fbits(prec)}
                else
                    raise FloatingPointArithmetic // Overflow
                endif
            else
                ai ← a.s || aien_{ebits(prec)-1..0} || aifr
            endif
        SNAN:
            if round ≠ NONE then
                raise FloatingPointArithmetic // Invalid
            endif
            if -a.e < fbits(prec) then
                ai ← a.s || 1^{ebits(prec)} || a.f_{-a.e-1..0} || 0^{fbits(prec)+a.e}
            else
                lsb ← a.f_{-a.e-1-fbits(prec)+1..0} ≠ 0
                ai ← a.s || 1^{ebits(prec)} || a.f_{-a.e-1..-a.e-1-fbits(prec)+2} || lsb
            endif
        QNAN:
            if -a.e < fbits(prec) then
                ai ← a.s || 1^{ebits(prec)} || a.f_{-a.e-1..0} || 0^{fbits(prec)+a.e}
            else
                lsb ← a.f_{-a.e-1-fbits(prec)+1..0} ≠ 0
                ai ← a.s || 1^{ebits(prec)} || a.f_{-a.e-1..-a.e-1-fbits(prec)+2} || lsb
            endif
        ZERO:
            ai ← a.s || 0^{ebits(prec)} || 0^{fbits(prec)}
        INFINITY:
            ai ← a.s || 1^{ebits(prec)} || 0^{fbits(prec)}
    endcase
enddef
```

```
def ai ← fsinkr(prec, a, round) as
    case a.t of
        NORM:
            msb ← findmsb(a.f)
            rb ← -a.e
            if rb ≤ 0 then
                aifr ← a.f_{msb..0} || 0^{-rb}
                aims ← msb - rb
            else
                case round of
                    C, C.D:
                        s ← 0^{msb-rb} || (~a.s)^{rb}
                    F, F.D:
                        s ← 0^{msb-rb} || (a.s)^{rb}
                    N, NONE:
                        s ← 0^{msb-rb} || ~a.f_{rb} || (a.f_{rb})^{rb-1}
                    X:
                        if a.f_{rb-1..0} ≠ 0 then
                            raise FloatingPointArithmetic // Inexact
                        endif
                        s ← 0
                    Z, Z.D:
                        s ← 0
                endcase
                v ← (0 || a.f_{msb..0}) + (0 || s)
                if v_{msb} = 1 then
                    aims ← msb + 1 - rb
                else
                    aims ← msb - rb
                endif
                aifr ← v_{aims..rb}
            endif
            if aims > prec then
                case round of
                    C.D, F.D, NONE, Z.D:
                        ai ← a.s || (~a.s)^{prec-1}
                    C, F, N, X, Z:
                        raise FloatingPointArithmetic // Overflow
                endcase
            elseif a.s = 0 then
                ai ← aifr
            else
                ai ← -aifr
            endif
        ZERO:
            ai ← 0^{prec}
        SNAN, QNAN:
            case round of
                C.D, F.D, NONE, Z.D:
                    ai ← 0^{prec}
                C, F, N, X, Z:
                    raise FloatingPointArithmetic // Invalid
            endcase
        INFINITY:
            case round of
                C.D, F.D, NONE, Z.D:
                    ai ← a.s || (~a.s)^{prec-1}
                C, F, N, X, Z:
                    raise FloatingPointArithmetic // Invalid
            endcase
    endcase
enddef

def c ← frecrest(a) as
    b.s ← 0
    b.t ← NORM
    b.e ← 0
    b.f ← 1
    c ← fest(fdiv(b,a))
enddef

def c ← frsqrest(a) as
    b.s ← 0
    b.t ← NORM
    b.e ← 0
    b.f ← 1
    c ← fest(fsqr(fdiv(b,a)))
enddef

def c ← fest(a) as
    if (a.t=NORM) then
        msb ← findmsb(a.f)
        a.e ← a.e + msb - 13
        a.f ← a.f_{msb..msb-12} || 1
    else
        c ← a
    endif
enddef

def c ← fsqr(a) as
    if (a.t=NORM) and (a.s=0) then
        c.s ← 0
        c.t ← NORM
        if (a.e_0 = 1) then
            c.e ← (a.e-127) / 2
            c.f ← sqr(a.f || 0^{127})
        else
            c.e ← (a.e-128) / 2
            c.f ← sqr(a.f || 0^{128})
        endif
    elseif (a.t=SNAN) or (a.t=QNAN) or a.t=ZERO or ((a.t=INFINITY) and (a.s=0)) then
        c ← a
    elseif ((a.t=NORM) or (a.t=INFINITY)) and (a.s=1) then
        c ← DEFAULTSNAN // Invalid
    else
        assert FALSE // should have covered all the cases above
    endif
enddef
```
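To make the internal format concrete, the following Python sketch mirrors the role of the F function for packed IEEE bit patterns. The names `Internal`, `unpack` and `value` are hypothetical. The sketch assumes that a NaN with the high-order fraction bit clear is signaling, and it leaves the NaN fraction untouched rather than quieting it.

```python
from typing import NamedTuple

EBITS = {16: 5, 32: 8, 64: 11, 128: 15}  # exponent field width per precision


class Internal(NamedTuple):
    s: int   # sign bit
    t: str   # NORM, ZERO, SNAN, QNAN or INFINITY
    e: int   # exponent, with the binary point at the units of f
    f: int   # fraction as an integer


def unpack(prec: int, ai: int) -> Internal:
    eb = EBITS[prec]
    fb = prec - 1 - eb                 # fraction field width
    bias = (1 << (eb - 1)) - 1
    s = (ai >> (prec - 1)) & 1
    ae = (ai >> fb) & ((1 << eb) - 1)
    af = ai & ((1 << fb) - 1)
    if ae == (1 << eb) - 1:            # all-ones exponent
        if af == 0:
            return Internal(s, "INFINITY", 0, 0)
        quiet = (af >> (fb - 1)) & 1
        return Internal(s, "QNAN" if quiet else "SNAN", -fb, af)
    if ae == 0:                        # zero or denormal
        if af == 0:
            return Internal(s, "ZERO", 0, 0)
        return Internal(s, "NORM", 1 - bias - fb, af)
    return Internal(s, "NORM", ae - bias - fb, (1 << fb) | af)


def value(x: Internal) -> float:
    """Evaluate (-1)^s * 2^e * f for a NORM value."""
    return (-1) ** x.s * x.f * 2.0 ** x.e


one_and_half = unpack(32, 0x3FC00000)  # single-precision 1.5
```

Here `value(one_and_half)` evaluates the mathematical interpretation given above, with the hidden bit made explicit in `f`.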

## Digital Signal Processing

The Zeus processor provides a set of operations that maintain the fullest possible use of 128-bit data paths when operating on lower-precision fixed-point or floating-point vector values. These operations are useful for several application areas, including digital signal processing, image processing and synthetic graphics. The basic goal of these operations is to accelerate the performance of algorithms that exhibit the following characteristics:

#### Low-precision arithmetic

The operands and intermediate results are fixed-point values represented in no greater than 64 bit precision. For floating-point arithmetic, operands and intermediate results are of 16, 32, or 64 bit precision.

The fixed-point arithmetic operations include add, subtract, multiply, divide, shifts, and set on compare.

The use of fixed-point arithmetic permits various forms of operation reordering that are not permitted in floating-point arithmetic. Specifically, the commutativity, associativity, and distribution identities can be used to reorder operations. Compilers can evaluate operations to determine what intermediate precision is required to get the specified arithmetic result.

Zeus supports several levels of precision, as well as operations to convert between these different levels. These precision levels are always powers of two, and are explicitly specified in the operation code.

When specified, add, subtract, and shift operations may cause a fixed-point arithmetic exception to occur on resulting conditions such as signed or unsigned overflow. The fixed-point arithmetic exception may also be invoked upon a signed or unsigned comparison.

#### Sequential access to data

The algorithms are or can be expressed as operations on sequentially ordered items in memory. Scatter-gather memory access or sparse-matrix techniques are not required.

Where an index variable is used with a multiplier, such multipliers must be powers of two. When the index is of the form nx+k, the value of n must be a power of two, and the values of k that are referenced should include the majority of values in the range 0..n-1. A negative multiplier may also be used.

#### Vectorizable operations

The operations performed on these sequentially ordered items are identical and independent. Conditional operations are either rewritten to use Boolean variables or masking, or the compiler is permitted to convert the code into such a form.


#### **Data-handling Operations**

The characteristics of these algorithms include sequential access to data, which permit the use of the normal load and store operations to reference the data. Octlet and hexlet loads and stores reference several sequential items of data, the number depending on the operand precision.

The discussion of these operations is independent of byte ordering, though the ordering of bit fields within octlets and hexlets must be consistent with the ordering used for bytes. Specifically, if big-endian byte ordering is used for the loads and stores, the figures below should assume that index values increase from left to right, and for little-endian byte ordering, the index values increase from right to left. For this reason, the figures indicate different index values with different shades, rather than numbering.

When an index of the nx+k form is used in array operands, where n is a power of 2, data memory sequentially loaded contains elements useful for separate operands. The "shuffle" instruction divides a triclet of data up into two hexlets, with alternate bit fields of the source triclet grouped together into the two results. An immediate field, h, in the instruction specifies which of the two regrouped hexlets to select for the result. For example, two X.SHUFFLE.256 rd=rc,rb,32,128,h operations rearrange the source triclet (c,b) into two hexlets as follows:



In the shuffle operation, two hexlet registers specify the source triclet, and one of the two result hexlets is specified as a hexlet register.
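The regrouping of alternate fields, and its inverse interleaving, can be modeled on plain Python lists. This is only a sketch of the data movement: `deal`, `shuffle` and `deal_n` are hypothetical helper names, each list element stands for one bit field, and no register widths or instruction encodings are modeled.

```python
from typing import List, Sequence, Tuple


def deal(fields: Sequence[int]) -> Tuple[List[int], List[int]]:
    """Group alternate fields together: even positions, then odd positions."""
    return list(fields[0::2]), list(fields[1::2])


def shuffle(a: Sequence[int], b: Sequence[int]) -> List[int]:
    """Interleave two equal-length field lists (the inverse of deal)."""
    out: List[int] = []
    for x, y in zip(a, b):
        out.extend((x, y))
    return out


def deal_n(fields: Sequence[int], n: int) -> List[List[int]]:
    """n-way deal (n a power of two) built from repeated two-way deals."""
    groups = [list(fields)]
    while len(groups) < n:
        groups = [half for g in groups for half in deal(g)]
    return groups


stream = list(range(8))
four_way = deal_n(stream, 4)
```

With repeated two-way deals the groups come out in bit-reversed order of k (0, 2, 1, 3 for n = 4), so a consumer of `four_way` has to index the groups accordingly.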

The example above directly applies to the case where n is 2. When n is larger, shuffle operations can be used to further subdivide the sequential stream. For example, when n is 4, we need to deal out 4 sets of doublet operands, as shown in the figure below:



When an array result of computation is accessed with an index of the form nx+k, for n a power of 2, the reverse of the "deal" operation needs to be performed on vectors of results to interleave them for storage in sequential order. The "shuffle" operation interleaves the bit fields of two octlets of results into a single hexlet. For example a X.SHUFFLE.16 operation combines two octlets of doublet fields into a hexlet as follows:



For larger values of n, a series of shuffle operations can be used to combine additional sets of fields, similarly to the mechanism used for the deal operations. For example, when n is 4, we need to shuffle up 4 sets of doublet operands, as shown in the figure below:



When the index of a source array operand or a destination array result is negated, or in other words, if of the form nx+k where n is negative, the elements of the array must be arranged in reverse order. The "swizzle" operation can reverse the order of the bit fields in a hexlet. For example, a X.SWIZZLE rd=rc,127,112 operation reverses the doublets within a hexlet.

<sup>2</sup>An example of the use of a four-way deal can be found in the appendix: Digital Signal Processing Applications: Conversion of Color to Monochrome

<sup>3</sup>An example of the use of a four-way shuffle can be found in the appendix: Digital Signal Processing Applications: Conversion of Monochrome to Color



In some cases, it is desirable to use a group instruction in which one or more operands is a single value, not an array. The "swizzle" operation can also copy operands to multiple locations within a hexlet. For example, a X.SWIZZLE 15,0 operation copies the low-order 16 bits to each doublet within a hexlet.
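Both swizzle examples are consistent with a simple bit-index model, sketched below in Python. The model is an assumption inferred from the two examples, not a statement of the instruction definition: it treats the two immediates as a copy mask and a swap mask applied to each result bit index, and the parameter names `icopy` and `iswap` are hypothetical.

```python
def swizzle(src: int, icopy: int, iswap: int, width: int = 128) -> int:
    """Result bit i is taken from source bit (i AND icopy) XOR iswap."""
    out = 0
    for i in range(width):
        j = (i & icopy) ^ iswap
        out |= ((src >> j) & 1) << i
    return out


# Eight doublets holding the values 0..7, doublet 0 in the low-order bits.
hexlet = sum(k << (16 * k) for k in range(8))
reversed_doublets = swizzle(hexlet, 127, 112)  # flips index bits 4..6
broadcast = swizzle(0xBEEF, 15, 0)             # keeps only index bits 0..3
```

Under this model, 112 (binary 1110000) complements the three index bits that select a doublet, which reverses doublet order, while a copy mask of 15 discards those same bits so every doublet reads the low-order 16 bits.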

Variations of the deal and shuffle operations are also useful for converting from one precision to another. This may be required if one operand is represented in a different precision than another operand or the result, or if computation must be performed with intermediate precision greater than that of the operands, such as when using an integer multiply.

When converting from a higher precision to a lower precision, specifically when halving the precision of a hexlet of bit fields, half of the data must be discarded, and the bit fields packed together. The "compress" operation is a variant of the "deal" operation, in which the operand is a hexlet, and the result is an octlet. An arbitrary half-sized sub-field of each bit field can be selected to appear in the result. For example, a selection of bits 19..4 of each quadlet in a hexlet is performed by the X.COMPRESS rd=rc,16,4 operation:
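A minimal Python sketch of this sub-field selection follows. The `compress` helper and its parameter names are hypothetical; the sketch assumes the first immediate is the result field size (so source fields are twice that size) and the second is the right-shift amount, which matches the 19..4 example for quadlets.

```python
def compress(src: int, size: int, shift: int, width: int = 128) -> int:
    """Select `size` bits starting at bit `shift` from each 2*size-bit field."""
    wide_mask = (1 << (2 * size)) - 1
    narrow_mask = (1 << size) - 1
    out = 0
    for k in range(width // (2 * size)):
        field = (src >> (2 * size * k)) & wide_mask
        out |= ((field >> shift) & narrow_mask) << (size * k)
    return out


# Four quadlets, each holding 0x12345678.
hexlet = sum(0x12345678 << (32 * k) for k in range(4))
octlet = compress(hexlet, 16, 4)  # bits 19..4 of each quadlet
```

Bits 19..4 of 0x12345678 are 0x4567, so each doublet of the 64-bit result holds that value.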



When converting from lower precision to higher precision, specifically when doubling the precision of an octlet of bit fields, one of several techniques can be used, either multiply, expand, or shuffle. Each has certain useful properties. In the discussion below, m is the precision of the source operand.

The multiply operation, described in detail below, automatically doubles the precision of the result, so multiplication by a constant vector will simultaneously double the precision of the operand and multiply by a constant that can be represented in m bits.

An operand can be doubled in precision and shifted left with the "expand" operation, which is essentially the reverse of the "compress" operation. For example the X.EXPAND rd=rc,16,4 expands from 16 bits to 32, and shifts 4 bits left:




The "shuffle" operation can double the precision of an operand and multiply it by 1 (unsigned only), $2^m$ or $2^m+1$, by specifying the sources of the shuffle operation to be a zeroed register and the source operand, the source operand and zero, or both to be the source operand. When multiplying by $2^m$, a constant can be freely added to the source operand by specifying the constant as the right operand to the shuffle.

### **Arithmetic Operations**

The characteristics of the algorithms that affect the arithmetic operations most directly are low-precision arithmetic, and vectorizable operations. The fixed-point arithmetic operations provided are most of the functions provided in the standard integer unit, except for those that check conditions. These functions include add, subtract, bitwise Boolean operations, shift, set on condition, and multiply, in forms that take packed sets of bit fields of a specified size as operands. The floating-point arithmetic operations provided are as complete as the scalar floating-point arithmetic set. The result is generally a packed set of bit fields of the same size as the operands, except that the fixed-point multiply function intrinsically doubles the precision of the bit field.

Conditional operations are provided only in the sense that the set on condition operations can be used to construct bit masks that can select between alternate vector expressions, using the bitwise Boolean operations. All instructions operate over the entire octlet or hexlet operands, and produce a hexlet result. The sizes of the bit fields supported are always powers of two.
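The mask-and-select idiom can be sketched in Python on 128-bit integers. The helpers `fields`, `pack`, `set_gt` and `select` are hypothetical names for illustration; `set_gt` stands in for a set-on-condition operation on unsigned fields and is not a particular Zeus instruction.

```python
from typing import List


def fields(x: int, size: int, width: int = 128) -> List[int]:
    """Split x into unsigned fields of `size` bits, low-order field first."""
    return [(x >> i) & ((1 << size) - 1) for i in range(0, width, size)]


def pack(vals: List[int], size: int) -> int:
    return sum(v << (size * k) for k, v in enumerate(vals))


def set_gt(a: int, b: int, size: int, width: int = 128) -> int:
    """Per-field mask: all ones where a > b (unsigned), else all zeros."""
    ones = (1 << size) - 1
    pairs = zip(fields(a, size, width), fields(b, size, width))
    return pack([ones if x > y else 0 for x, y in pairs], size)


def select(mask: int, x: int, y: int, width: int = 128) -> int:
    """Bitwise select: x where mask is 1, y where mask is 0."""
    return (x & mask) | (y & ~mask & ((1 << width) - 1))


a = pack([1, 9, 3, 7, 0, 0, 0, 0], 16)
b = pack([5, 2, 3, 8, 0, 0, 0, 1], 16)
per_field_max = select(set_gt(a, b, 16), a, b)
```

The result is a per-field maximum computed without any branch, which is how a conditional expression in a vectorizable loop can be rewritten.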

#### Galois Field Operations

Zeus provides a general software solution to the most common operations required for Galois Field arithmetic. The instructions provided include a polynomial multiply, with the polynomial specified as one register operand. This instruction can be used to perform CRC generation and checking, Reed-Solomon code generation and checking, and spread-spectrum encoding and decoding.
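As background for the polynomial multiply, here is a small Python sketch of multiplication in a binary Galois field with a caller-supplied reduction polynomial. The `gf_mul` helper is a hypothetical scalar model of the arithmetic only; it does not describe the Zeus instruction's operand layout. The usage line uses the degree-8 polynomial 0x11B from the AES specification purely because its products are well known.

```python
def gf_mul(a: int, b: int, poly: int, degree: int = 8) -> int:
    """Multiply a and b in GF(2^degree), reducing by `poly`.

    `poly` includes the x^degree term; a and b must be below 2^degree.
    """
    result = 0
    while b:
        if b & 1:
            result ^= a        # addition in GF(2) is XOR
        b >>= 1
        a <<= 1                # multiply a by x
        if a >> degree:
            a ^= poly          # reduce modulo the field polynomial
    return result


product = gf_mul(0x57, 0x83, 0x11B)
```

Because the polynomial is an ordinary argument, the same routine serves different codes (CRC, Reed-Solomon) by changing only that operand, which is the flexibility the register-specified polynomial provides.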

## Software Conventions

The following section describes software conventions that are to be employed at software module boundaries, in order to permit the combination of separately compiled code and to provide standard interfaces between application, library and system software. Register usage and procedure call conventions may be modified, simplified or optimized when a single compilation encloses procedures within a compilation unit so that the procedures have no external interfaces. For example, internal procedures may permit a greater number of register-passed parameters, or have registers allocated to avoid the need to save registers at procedure boundaries, or may use a single stack or data pointer allocation to suffice for more than one level of procedure call.

#### Register Usage

All Zeus registers are identical and general-purpose; there is no dedicated zero-valued register, and no dedicated floating-point registers. However, some procedure-call-oriented instructions imply usage of registers zero (0) and one (1) in a manner consistent with the conventions described below. By software convention, the non-specific general registers are used in more specific ways.

| register<br>number | assembler<br>names | usage         | how saved |
|--------------------|--------------------|---------------|-----------|
| 0                  | lp, r0             | link pointer  | caller    |
| 1                  | dp, r1             | data pointer  | caller    |
| 2-9                | r2-r9              | parameters    | caller    |
| 10-31              | r10-r31            | temporary     | caller    |
| 32-61              | r32-r61            | saved         | callee    |
| 62                 | fp, r62            | frame pointer | callee    |
| 63                 | sp, r63            | stack pointer | callee    |

register usage

At a procedure call boundary, registers are saved either by the caller or callee procedure, which provides a mechanism for leaf procedures to avoid needing to save registers. Compilers may choose to allocate variables into caller or callee saved registers depending on how their lifetimes overlap with procedure calls.

### Procedure Calling Conventions

Procedure parameters are normally allocated in registers, starting from register 2 up to register 9. These registers hold up to 8 parameters, which may each be of any size from one byte to sixteen bytes (hexlet), including floating-point and small structure parameters. Additional parameters are passed in memory, allocated on the stack. For C procedures which use varargs.h or stdarg.h and pass parameters to further procedures, the compilers must leave room in the stack memory allocation to save registers 2 through 9 into memory contiguously with the additional stack memory parameters, so that procedures such as _doprnt can refer to the parameters as an array.

Procedure return values are also allocated in registers, starting from register 2 up to register 9. Larger values are passed in memory, allocated on the stack.

There are several pointers maintained in registers for the procedure calling conventions: lp, sp, dp, fp.

The lp register contains the address to which the callee should return at the conclusion of the procedure. If the procedure is also a caller, the lp register will need to be saved on the stack, once, before any procedure call, and restored, once, after all procedure calls. The procedure returns with a branch instruction, specifying the lp register.

The sp register is used to form addresses to save parameter and other registers and to maintain local variables, i.e., data that is allocated as a LIFO stack. For procedures that require a stack, normally a single allocation is performed, which allocates space for input parameters, local variables, saved registers, and output parameters all at once. The sp register is always hexlet aligned.

The dp register is used to address pointers, literals and static variables for the procedure. The dp register points to a small (approximately 4096-entry) array of pointers, literals, and statically-allocated variables, which is used locally to the procedure. The uses of the dp register are similar to the use of the gp register on a Mips R-series processor, except that each procedure may have a different value, which expands the space addressable by small offsets from this pointer. This is an important distinction, as the offset field of Zeus load and store instructions is only 12 bits. The compiler may use additional registers and/or indirect pointers to address larger regions for a single procedure. The compiler may also share a single dp register value between procedures which are compiled as a single unit (including procedures which are externally callable), eliminating the need to save, modify and restore the dp register for calls between procedures which share the same dp register value.

Load- and store- immediate-aligned instructions, specifying the dp register as the base register, are generally used to obtain values from the dp region. These instructions shift the immediate value by the logarithm of the size of the operand, so loads and stores of large operands may reach farther from the dp register than of small operands. The size of the addressable region is maximized if the elements to be placed in the dp region are sorted according to size, with the smallest elements placed closest to the dp base. At points where the size changes, appropriate padding is added to keep elements aligned to memory boundaries matching the size of the elements. Using this technique, the maximum size of the dp region is always at least 4096 items, and may be larger when the dp area is composed of a mixture of data sizes.
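The sorted layout with padding can be sketched in Python. The `layout` helper is hypothetical; it assumes item sizes are powers of two in bytes and that the 12-bit immediate is an unsigned element index scaled by the operand size (the signedness of the field is not stated here, so that is an assumption).

```python
from typing import List, Sequence, Tuple


def layout(sizes: Sequence[int], imm_range: int = 4096) -> List[Tuple[int, int]]:
    """Place items smallest-first; return (size, byte offset from dp) pairs."""
    offset = 0
    placed: List[Tuple[int, int]] = []
    for size in sorted(sizes):
        offset = -(-offset // size) * size    # pad up to the item's alignment
        if offset // size >= imm_range:
            raise ValueError("item not reachable with a scaled 12-bit offset")
        placed.append((size, offset))
        offset += size
    return placed


region = layout([8, 1, 16, 2])
```

Because the immediate is scaled by operand size, a hexlet can sit up to sixteen times farther from dp than a byte, which is why placing the small items first maximizes the reachable region.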



The dp register mechanism also permits code to be shared, with each static instance of the dp region assigned to a different address in memory. In conjunction with position-independent or pc-relative branches, this allows library code to be dynamically relocated and shared between processes.

To implement an inter-module (separately compiled) procedure call, the lp register is loaded with the entry point of the procedure, and the dp register is loaded with the value of the dp register required for the procedure. These two values are located adjacent to each other as a pair of octlet quantities in the dp region for the calling procedure. For a statically-linked inter-module procedure call, the linker fills in the values at link time. However, this mechanism also provides for dynamic linking, by initially filling in the lp and dp fields in the data structure to invoke the dynamic linker. The dynamic linker can use the contents of the lp and/or dp registers to determine the identity of the caller and callee, to find the location to fill in the pointers and resume execution. Specifically, the lp value is initially set to point to an entry point in the dynamic linker, and the dp value is set to point to itself: the location of the lp and dp values in the dp region of the calling procedure. The identity of the procedure can be discovered from a string following the dp pointer, or a separate table, indexed by the dp pointer.

The fp register is used to address the stack frame when the stack size varies during execution of a procedure, such as when using the GNU C alloca function. When the stack size can be determined at compile time, the sp register is used to address the stack frame and the fp register may be used for any other general purpose as a callee-saved register.

#### Typical static-linked, intra-module calling sequence:

```
caller (non-leaf):
caller:     A.ADDI      sp@-size        // allocate caller stack frame
            S.I.64.A    lp,sp,off       // save original lp register
            ... (callee using same dp as caller)
            B.LINK.I    callee
            ... (callee using same dp as caller)
            B.LINK.I    callee
            L.I.64.A    lp=sp,off       // restore original lp register
            A.ADDI      sp@size         // deallocate caller stack frame
            B           lp              // return

callee (leaf):
callee:     ... (code using dp)
            B           lp              // return
```

Procedures that are compiled together may share a common data region, in which case there is no need to save, load, and restore the dp region in the callee, assuming that the callee does not modify the dp register. The pc-relative addressing of the B.LINK.I instruction permits the code region to be position-independent.

#### Minimum static-linked, intra-module calling sequence:

```
caller (non-leaf):
caller:   A.COPY     r31=lp              // save original lp register
          ... (callee using same dp as caller)
          B.LINK.I   callee
          ... (callee using same dp as caller)
          B.LINK.I   callee
          B          r31                 // return

callee (leaf):
callee:   ... (code using dp, r31 unused)
          B          lp                  // return
```

When all the callee procedures are intra-module, the stack frame may also be eliminated from the caller procedure by using "temporary" caller save registers not utilized by the callee leaf procedures. In addition to the lp value indicated above, this usage may include other values and variables that live in the caller procedure across callee procedure calls.

#### Typical dynamic-linked, inter-module calling sequence:


```
caller (non-leaf):
caller:   A.ADDI     sp@-size            // allocate caller stack frame
          S.I.64.A   lp,sp,off           // save original lp register
          S.I.64.A   dp,sp,off           // save original dp register
          ... (code using dp)
          L.I.64.A   lp=dp,off           // load lp
          L.I.64.A   dp=dp,off           // load dp
          B.LINK     lp=lp               // invoke callee procedure
          L.I.64.A   dp=sp,off           // restore dp register from stack
          ... (code using dp)
          L.I.64.A   lp=sp,off           // restore original lp register
          A.ADDI     sp@size             // deallocate caller stack frame
          B          lp                  // return

callee (leaf):
callee:   ... (code using dp)
          B          lp                  // return
```

The load instruction is required in the caller following the procedure call to restore the dp register. A second load instruction also restores the lp register, which may be located at any point between the last procedure call and the branch instruction which returns from the procedure.

#### System and Privileged Library Calls

It is an objective to make calls to system facilities and privileged libraries as similar as possible to normal procedure calls as described above. Rather than invoke system calls as an exception, which involves significant latency and complication, we prefer to use a modified procedure call in which the process privilege level is quietly raised to the required level. To provide this mechanism safely, interaction with the virtual memory system is required.

Such a procedure must not be entered from anywhere other than its legitimate entry point, to prohibit entering a procedure after the point at which security checks are performed or with invalid register contents; otherwise the access to a higher privilege level can lead to a security violation. In addition, the procedure generally must have access to memory data, for which addresses must be produced by the privileged code. To facilitate generating these addresses, the branch-gateway instruction allows the privileged code procedure to rely on the fact that a single register has been verified to contain a pointer to a valid memory region.

The branch-gateway instruction ensures both that the procedure is invoked at a proper entry point, and that other registers such as the data pointer and stack pointer can be properly set. To ensure this, the branch-gateway instruction retrieves a "gateway" directly from the protected virtual memory space. The gateway contains the virtual address of the entry point of the procedure and the target privilege level. A gateway can only exist in regions of the virtual address space designated to contain them, and can only be used to access privilege levels at or below the privilege level at which the memory region can be written, to ensure that a gateway cannot be forged.

The branch-gateway instruction ensures that register 1 (dp) contains a valid pointer to the gateway for this target code address by comparing the contents of register 0 (lp) against the gateway retrieved from memory and causing an exception trap if they do not match. By ensuring that register 1 points to the gateway, auxiliary information, such as the data pointer and stack pointer, can be set by loading values located by the contents of register 1. For example, the eight bytes following the gateway may be used as a pointer to a data region for the procedure.


Before executing the branch-gateway instruction, register 1 must be set to point at the gateway, and register 0 must be set to the target code address plus the desired privilege level. An "L.I.64.L.A r0=r1,0" instruction is one way to set register 0, if register 1 has already been set, but any means of getting the correct value into register 0 is permissible.
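The checks performed by the branch-gateway instruction can be summarized with a small Python model. It is a sketch, not a definition of B.GATE: it assumes that the privilege level occupies the two low-order bits of a code address, that a larger number means more privilege, and that memory and the set of gateway regions can be represented as dictionaries. `branch_gateway`, `GatewayFault`, and `gateway_limit` are hypothetical names.

```python
PRIVILEGE_BITS = 2                         # assumption: privilege level in the low address bits
PRIVILEGE_MASK = (1 << PRIVILEGE_BITS) - 1


class GatewayFault(Exception):
    """Stands in for the exception trap raised on a failed gateway check."""


def branch_gateway(memory, gateway_limit, r0, r1, current_privilege):
    """Model the gateway checks; return (target_address, new_privilege).

    memory        -- dict: address -> octlet stored there (the gateway)
    gateway_limit -- dict: gateway address -> highest privilege that region may grant
    r0            -- target code address plus desired privilege level
    r1            -- pointer to the gateway
    """
    if r1 not in gateway_limit:
        raise GatewayFault("r1 does not point into a gateway region")
    gateway = memory[r1]
    if gateway != r0:
        raise GatewayFault("r0 does not match the gateway contents")
    target = gateway & ~PRIVILEGE_MASK
    requested = gateway & PRIVILEGE_MASK
    if requested > gateway_limit[r1]:
        raise GatewayFault("gateway requests more privilege than its region allows")
    # A request that is not above the current level leaves the privilege unchanged.
    return target, max(current_privilege, requested)


memory = {0x1000: 0x4000 | 3}              # one gateway: entry 0x4000, privilege 3
gateway_limit = {0x1000: 3}
target, privilege = branch_gateway(memory, gateway_limit,
                                   r0=0x4003, r1=0x1000, current_privilege=0)
```

Because register 0 must equal the gateway contents, a caller cannot pair a valid gateway pointer with a different entry address.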



Similarly, a return from a system or privileged routine involves a reduction of privilege. This need not be carefully controlled by architectural facilities, so a procedure may freely branch to a less-privileged code address. Normally, such a procedure restores the stack frame, then uses the branch-down instruction to return.


#### Typical dynamic-linked, inter-gateway calling sequence:

```
caller:
caller:   A.ADDI     sp@-size            // allocate caller stack frame
          S.I.64.A   lp,sp,off
          S.I.64.A   dp,sp,off
          L.I.64.A   lp=dp,off           // load lp
          L.I.64.A   dp=dp,off           // load dp
          B.GATE
          L.I.64.A   dp=sp,off
          ... (code using dp)
          L.I.64.A   lp=sp,off           // restore original lp register
          A.ADDI     sp@size             // deallocate caller stack frame
          B          lp                  // return

callee (non-leaf):
callee:   L.I.64.A   dp=dp,off           // load dp with data pointer
          S.I.64.A   sp,dp,off
          L.I.64.A   sp=dp,off           // new stack pointer
          S.I.64.A   lp,sp,off
          S.I.64.A   dp,sp,off
          ... (using dp)
          L.I.64.A   dp=sp,off
          ... (code using dp)
          L.I.64.A   lp=sp,off           // restore original lp register
          L.I.64.A   sp=sp,off           // restore original sp register
          B.DOWN

callee (leaf, no stack):
callee:   ... (using dp)
          B.DOWN
```

It can be observed that the calling sequence is identical to that of the inter-module calling sequence shown above, except for the use of the B.GATE instruction instead of a B.LINK instruction. Indeed, if a B.GATE instruction is used when the privilege level in the lp register is not higher than the current privilege level, the B.GATE instruction performs an identical function to a B.LINK.

The callee, if it uses a stack for local variable allocation, cannot necessarily trust the value of the sp passed to it, as it can be forged. Similarly, any pointers which the caller provides should not be used directly unless they are verified to point to regions which the callee should be permitted to address. This can be avoided by defining application programming interfaces (APIs) in which all values are passed and returned in registers, or by using a trusted, intermediate privilege wrapper routine to pass and return parameters. The method described below can also be used.

It can be useful to have highly privileged code call less-privileged routines. For example, a user may request that errors in a privileged routine be reported by invoking a user-supplied error-logging routine. To invoke the procedure, the privilege can be reduced via the branch-down instruction. The return from the procedure actually requires an increase in privilege, which must be carefully controlled. This is dealt with by placing the procedure call within a lower-privilege procedure wrapper, which uses the branch-gateway instruction to return to the higher privilege region after the call through a secure re-entry point. Special care must be taken to ensure that the less-privileged routine is not permitted to gain unauthorized access by corruption of the stack or saved registers, such as by saving all registers and setting up a new stack frame (or restoring the original lower-privilege stack) that may be manipulated by the less-privileged routine. Finally, such a technique is vulnerable to an unprivileged routine attempting to use the re-entry point directly, so it may be appropriate to keep a privileged state variable which controls permission to enter at the re-entry point.

### Instruction Scheduling

The next section describes detailed pipeline organization for Zeus, which has a significant influence on instruction scheduling. Here we will elaborate some general rules for effective scheduling by a compiler. Specific information on numbers of functional units, functional unit parallelism and latency is quite implementation-dependent; the values indicated here are valid for Zeus's first implementation.

#### Separate Addressing from Execution

Zeus has separate function units to perform addressing operations (A, L, S, B instructions) from execution operations (G, X, E, W instructions). When possible, Zeus will execute all the addressing operations of an instruction stream, deferring execution of the execution operations until dependent load instructions are completed. Thus, the latency of the memory system is hidden, so long as addressing operations themselves do not need to wait for memory.

#### Software Pipeline

Instructions should generally be scheduled so that previous operations can be completed at the time of issue. When this is not possible, the processor inserts sufficient empty cycles to perform the instructions precisely - explicit no-operation instructions are not required.

#### Multiple Issue

Zeus can issue up to two addressing operations and up to two execution operations per cycle per thread. Considering functional unit parallelism, described below, as many as four instruction issues per cycle are possible per thread.
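The issue limits stated above can be expressed as a simple check that a compiler scheduler might apply to a candidate group of instructions. This sketch classifies instructions only by their major class letter and ignores which specific functional unit each one needs, so it is a necessary condition rather than a complete model; `can_issue_together` is a hypothetical helper.

```python
ADDRESSING = {"A", "L", "S", "B"}   # classes handled by the addressing side
EXECUTION = {"G", "X", "E", "W"}    # classes handled by the execution side


def can_issue_together(classes, max_addressing=2, max_execution=2):
    """Return True if the group respects the per-cycle, per-thread issue limits."""
    addressing = sum(1 for c in classes if c in ADDRESSING)
    execution = sum(1 for c in classes if c in EXECUTION)
    if addressing + execution != len(classes):
        raise ValueError("unknown instruction class in group")
    return addressing <= max_addressing and execution <= max_execution


full_group = can_issue_together(["L", "A", "G", "E"])   # two addressing, two execution
too_many = can_issue_together(["L", "L", "S"])          # three addressing operations
```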

#### Functional Unit Parallelism

Zeus has separate functional units for several classes of execution operations. An A unit performs scalar add, subtract, boolean, and shift-add operations for addressing and branch calculations. The remaining functional units are execution resources, which perform operations subsequent to memory loads and which operate on values in a parallel, partitioned form. A G unit performs add, subtract, boolean, and shift-add operations. An X unit performs general shift operations. An E unit performs multiply and floating-point operations. A T unit performs table-look-up operations.


Each instruction uses one or more of these units, according to the table below.

| Instruction | A | G | X | E | T |
|-------------|---|---|---|---|---|
| A           | X |   |   |   |   |
| B           | X |   |   |   |   |
| L           | X |   |   |   |   |
| S           | X |   |   |   |   |
| G           |   | X |   |   |   |
| X           |   |   | X |   |   |
| E           |   |   | X | X |   |
| W.TRANSLATE | X |   |   |   | X |
| W.MULMAT    | X |   | X | X |   |
| W.SWITCH    | X |   | X |   |   |

#### Latency

The latency of each functional unit depends on what operation is performed in the unit, and where the result is used. The aggressive nature of the pipeline makes it difficult to characterize the latency of each operation with a single number. Because the addressing unit is decoupled from the execution unit, the latency of load operations is generally hidden, unless the result of a load instruction must be returned to the addressing unit. Store instructions must be able to compute the address to which the data is to be stored in the addressing unit, but the data will not be irrevocably stored until the data is available and it is valid to retire the store instruction. However, under certain conditions, data may be forwarded from a store instruction to subsequent load instructions, once the data is available.

The latency of each of these units, for the initial Zeus implementation, is indicated below:

| Unit | Instruction | Latency rules |
|------|-------------|---------------|
| A    | A           | 1 cycle |
| A    | L           | Address operands must be ready to issue; 4 cycles to A unit, 0 to G, X, E, T units |
| A    | S           | Address operands must be ready to issue; store occurs when data is ready and instruction may be retired |
| A    | B           | Conditional branch operands may be provided from the A unit (64-bit values) or the G unit (128-bit values); 4 cycles for mispredicted branch |
| A    | W           | Address operand must be ready to issue |
| G    | G           | 1 cycle |
| X    | X, W.SWITCH | 1 cycle for data operands, 2 cycles for shift amount or control operand |
| E    | E, W.MULMAT | 4 cycles |
| T    | W.TRANSLATE | 1 cycle |


## Pipeline Organization

Zeus performs all instructions as if executed one-by-one, in-order, with precise exceptions always available. Consequently, code that ignores the subsequent discussion of Zeus pipeline implementations will still perform correctly. However, the highest performance of the Zeus processor is achieved only by matching the ordering of instructions to the characteristics of the pipeline. In the following discussion, the general characteristics of all Zeus implementations precede discussion of specific choices for specific implementations.

### Classical Pipeline Structures

Pipelining in general refers to hardware structures that overlap various stages of execution of a series of instructions so that the time required to perform the series of instructions is less than the sum of the times required to perform each of the instructions separately. Additionally, pipelines carry the connotation of a collection of hardware structures which have a simple ordering and where each structure performs a specialized function.

The diagram below shows the timing of what has become a canonical pipeline structure for a simple RISC processor, with time on the horizontal axis increasing to the right, and successive instructions on the vertical axis going downward. The stages I, R, E, M, and W refer to units which perform instruction fetch, register file fetch, execution, data memory fetch, and register file write. The stages are aligned so that the result of the execution of an instruction may be used as the source of the execution of an immediately following instruction, as seen by the fact that the end of an E stage (bold in line 1) lines up with the beginning of the E stage (bold in line 2) immediately below. Also, it can be seen that the result of a load operation executing in stages E and M (bold in line 3) is not available in the immediately following instruction (line 4), but may be used two cycles later (line 5); this is the cause of the load delay slot seen on some RISC processors.

| Line |   |   |   |   |   |   |   |   |   |
|------|---|---|---|---|---|---|---|---|---|
| 1    | I | R | **E** | M | W |   |   |   |   |
| 2    |   | I | R | **E** | M | W |   |   |   |
| 3    |   |   | I | R | **E** | **M** | W |   |   |
| 4    |   |   |   | I | R | E | M | W |   |
| 5    |   |   |   |   | I | R | E | M | W |
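The staggered timing in the diagram above can be regenerated with a short Python sketch, which also makes the forwarding and load-delay observations checkable. The function names `pipeline_rows` and `stage_cycle` are illustrative, and cycles are counted from zero.

```python
STAGES = ["I", "R", "E", "M", "W"]   # fetch, register read, execute, memory, write


def pipeline_rows(n_instructions, stages=STAGES):
    """Return one row per instruction, each shifted one cycle later than the last."""
    width = n_instructions + len(stages) - 1
    rows = []
    for i in range(n_instructions):
        row = [" "] * width
        for offset, name in enumerate(stages):
            row[i + offset] = name
        rows.append(row)
    return rows


def stage_cycle(instruction, stage, stages=STAGES):
    """Cycle in which the given zero-based instruction occupies the given stage."""
    return instruction + stages.index(stage)


for number, row in enumerate(pipeline_rows(5), start=1):
    print(number, " ".join(row))
```

With these definitions, the E stage of one instruction finishes just before the E stage of the next begins, while a load's M stage coincides with the E stage of the instruction that follows it, which is why that instruction cannot use the loaded value.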

In the diagrams below, we simplify the diagrams somewhat by eliminating the pipe stages for instruction fetch, register file fetch, and register file write, which can be understood to precede and follow the portions of the pipelines diagrammed. The diagram above is shown again in this new format, showing that the canonical pipeline has very little overlap of the actual execution of instructions.




A superscalar pipeline is one capable of simultaneously issuing two or more instructions which are independent, in that they can be executed in either order and separately, producing the same result as if they were executed serially. The diagram below shows a two-way superscalar processor, where one instruction may be a register-to-register operation (using stage E) and the other may be a register-to-register operation (using stage A) or a memory load or store (using stages A and M).



A superpipelined pipeline is one capable of issuing simple instructions frequently enough that the result of a simple instruction must be independent of the immediately following one or more instructions. The diagram below shows a two-cycle superpipelined implementation:



In the diagrams below, pipeline stages are labelled with the type of instruction that may be performed by that stage. The position of the stage further identifies the function of that stage, as for example a load operation may require several L stages to complete the instruction.

#### Superstring Pipeline

Zeus architecture provides for implementations designed to fetch and execute several instructions in each clock cycle. For a particular ordering of instruction types, one instruction of each type may be issued in a single clock cycle. The ordering required is A, L, E, S, B; in other words, a register-to-register address calculation, a memory load, a register-to-register data calculation, a memory store, and a branch. Because of the organization of the pipeline, each of these instructions may be serially dependent. Instructions of type E include the fixed-point execute-phase instructions as well as floating-point and digital signal processing instructions. We call this form of pipeline organization "superstring,"<sup>4</sup> because of the ability to issue a string of dependent instructions in a single clock cycle, as distinguished from superscalar or superpipelined organizations, which can only issue sets of independent instructions.

<sup>4</sup>Readers with a background in theoretical physics may have seen this term in another, unrelated, context.

These instructions take from one to four cycles of latency to execute, and a branch prediction mechanism is used to keep the pipeline filled. The diagram below shows a box for the interval between issue of each instruction and the completion. Bold letters mark the critical latency paths of the instructions, that is, the periods between the required availability of the source registers and the earliest availability of the result registers. The A-L critical latency path is a special case, in which the result of the A instruction may be used as the base register of the L instruction without penalty. E instructions may require additional cycles of latency for certain operations, such as fixed-point multiply and divide, floating-point and digital signal processing operations.



### Superspring Pipeline

Zeus architecture provides an additional refinement to the organization defined above, in which the time permitted by the pipeline to service load operations may be flexibly extended. Thus, the front of the pipeline, in which A, L and B type instructions are handled, is decoupled from the back of the pipeline, in which E and S type instructions are handled. This decoupling occurs at the point at which the data cache and its backing memory are referenced; similarly, a FIFO that is filled by the instruction fetch unit decouples instruction cache references from the front of the pipeline shown above. The depth of the FIFO structures is implementation-dependent, i.e. not fixed by the architecture.


The diagram below indicates why we call this pipeline organization feature "superspring," an extension of our superstring organization.



With the superspring organization, the latency of load instructions can be hidden, as execute instructions are deferred until the results of the load are available. Nevertheless, the execution unit still processes instructions in normal order, and provides precise exceptions.



#### Superthread Pipeline

This technique is not employed in the initial Zeus implementation, though it was present in an earlier prototype implementation.

A difficulty of superpipelining is that dependent operations must be separated by the latency of the pipeline, and for highly pipelined machines, the latency of simple operations can be quite significant. The Zeus "superthread" pipeline provides for very highly pipelined implementations by alternating execution of two or more independent threads. In this context, a thread is the state required to maintain an independent execution; the architectural state required is that of the register file contents, program counter, privilege level, local TB, and when required, exception status. Ensuring that only one thread may handle an exception at one time may minimize the latter state, exception status. In order to ensure that all threads make reasonable forward progress, several of the machine resources must be scheduled fairly.

An example of a resource that must be fairly shared is the data memory/cache subsystem. In a prototype implementation, Zeus is able to perform a load operation only on every second cycle, and a store operation only on every fourth cycle. Zeus schedules these fixed timing resources fairly by using a round-robin schedule for a number of threads that is relatively prime to the resource reuse rates. For this implementation, five simultaneous threads of execution ensure that resources which may be used every two or four cycles are fairly shared by allowing the instructions which use those resources to be issued only on every second or fourth issue slot for that thread.

In the diagram below, the thread number which issues an instruction is indicated on each clock cycle, and below it, a list of which functional units may be used by that instruction. The diagram repeats every 20 cycles, so cycle 20 is similar to cycle 0, cycle 21 is similar to cycle 1, etc. This schedule ensures that no resource conflicts occur between threads for these resources. Thread 0 may issue an E, L, S or B on cycle 0, but on its next opportunity, cycle 5, may only issue E or B, and on cycle 10 may issue E, L or B, and on cycle 15, may issue E or B.

| Cycle  | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 |
|--------|---|---|---|---|---|---|---|---|---|---|----|----|----|----|----|----|----|----|----|----|
| Thread | 0 | 1 | 2 | 3 | 4 | 0 | 1 | 2 | 3 | 4 | 0  | 1  | 2  | 3  | 4  | 0  | 1  | 2  | 3  | 4  |
|        | E | E | E | E | E | E | E | E | E | E | E  | E  | E  | E  | E  | E  | E  | E  | E  | E  |
|        | L |   | L |   | L |   | L |   | L |   | L  |    | L  |    | L  |    | L  |    | L  |    |
|        | S |   |   |   | S |   |   |   | S |   |    |    | S  |    |    |    | S  |    |    |    |
|        | B | B | B | B | B | B | B | B | B | B | B  | B  | B  | B  | B  | B  | B  | B  | B  | B  |

Superthread pipeline

When seen from the perspective of an individual thread, the resource use diagram looks similar to that of the collection. Thus an individual thread may use the load unit every two instructions, and the store unit every four instructions.

| Cycle  | 0 | 5 | 10 | 15 | 20 | 25 | 30 | 35 | 40 | 45 | 50 | 55 | 60 | 65 | 70 | 75 | 80 | 85 | 90 | 95 |
|--------|---|---|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|
| Thread | 0 | 0 | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  |
|        | E | E | E  | E  | E  | E  | E  | E  | E  | E  | E  | E  | E  | E  | E  | E  | E  | E  | E  | E  |
|        | L |   | L  |    | L  |    | L  |    | L  |    | L  |    | L  |    | L  |    | L  |    | L  |    |
|        | S |   |    |    | S  |    |    |    | S  |    |    |    | S  |    |    |    | S  |    |    |    |
|        | B | B | B  | B  | B  | B  | B  | B  | B  | B  | B  | B  | B  | B  | B  | B  | B  | B  | B  | B  |

Superthread pipeline
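The round-robin schedule described above follows from three small rules, sketched here in Python: the issuing thread is the cycle number modulo 5, loads are permitted on every second cycle, and stores on every fourth. The constant and function names are illustrative, and the sketch covers only the E, L, S and B resources shown in the diagrams.

```python
THREADS = 5        # relatively prime to both reuse periods below
LOAD_PERIOD = 2    # a load may start only on every second cycle
STORE_PERIOD = 4   # a store may start only on every fourth cycle


def issue_slot(cycle):
    """Return (thread, units) for the instruction issued on the given cycle."""
    units = ["E"]
    if cycle % LOAD_PERIOD == 0:
        units.append("L")
    if cycle % STORE_PERIOD == 0:
        units.append("S")
    units.append("B")
    return cycle % THREADS, units


def thread_view(thread, cycles):
    """List the (cycle, units) slots that belong to one thread."""
    return [(c, issue_slot(c)[1]) for c in range(cycles) if c % THREADS == thread]


for cycle, units in thread_view(0, 20):
    print(cycle, " ".join(units))
```

Because 5 shares no factor with 2 or 4, every thread receives the same number of load and store opportunities in each 20-cycle period: two load slots and one store slot.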

A Zeus Superthread pipeline, with 5 simultaneous threads of execution, permits simple operations, such as register-to-register add (G.ADD), to take 5 cycles to complete, allowing for an extremely deeply pipelined implementation.

### Simultaneous Multithreading

The initial Zeus implementation performs simultaneous multithreading among 4 threads. The 4 threads share a common memory system and a common T unit. Pairs of threads share two G units, one X unit, and one E unit. Each thread individually has two A units. A fair allocation scheme balances access to the shared resources by the four threads.


#### Branch/fetch Prediction

Zeus does not have delayed branch instructions, and so relies upon branch or fetch prediction to keep the pipeline full around unconditional and conditional branch instructions. In the simplest form of branch prediction, as in Zeus's first implementation, a taken conditional backward (toward a lower address) branch predicts that a future execution of the same branch will be taken. More elaborate prediction may cache the source and target addresses of multiple branches, both conditional and unconditional, and both forward and reverse.

The hardware prediction mechanism is tuned for optimizing conditional branches that close loops or express frequent alternatives, and will generally require substantially more cycles when executing conditional branches whose outcome is not predominately taken or not-taken. For such cases of unpredictable conditional results, the use of code that avoids conditional branches in favor of the use of compare-set and multiplex instructions may result in greater performance.
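The branch-free alternative mentioned above can be illustrated generically in Python. The helper names `compare_set_less` and `multiplex` are illustrative and are not Zeus mnemonics; the sketch assumes unsigned 64-bit operands and shows only the general technique of turning a comparison into a mask and then selecting between two values with it.

```python
MASK64 = (1 << 64) - 1


def compare_set_less(a, b):
    """Return an all-ones 64-bit mask when a < b, otherwise zero (no branch)."""
    return (-(a < b)) & MASK64


def multiplex(mask, if_set, if_clear):
    """Select bits from if_set where mask is 1 and from if_clear where it is 0."""
    return ((if_set & mask) | (if_clear & ~mask)) & MASK64


def branchless_min(a, b):
    """Minimum of two unsigned 64-bit values without a conditional branch."""
    return multiplex(compare_set_less(a, b), a, b)


smaller = branchless_min(3, 7)
```

Both candidate values are computed and the mask picks one, so the cost is the same whichever way the comparison falls.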

Under some conditions, the above technique may not be applicable, for example if the conditional branch "guards" code which cannot be performed when the branch is taken. This may occur, for example, when a conditional branch tests for a valid (non-zero) pointer and the conditional code performs a load or store using the pointer. In these cases, the conditional branch has a small positive offset, but is unpredictable. A Zeus pipeline may handle this case as if the branch is always predicted to be not taken, with the recovery of a misprediction causing cancellation of the instructions which have already been issued but not completed which would be skipped over by the taken conditional branch. This "conditional-skip" optimization is performed by the initial Zeus implementation and requires no specific architectural feature to access or implement.

A Zeus pipeline may also perform "branch-return" optimization, in which a branch-link instruction saves a branch target address that is used to predict the target of the next returning branch instruction. This optimization may be implemented with a depth of one (only one return address kept), or as a stack of finite depth, where a branch and link pushes onto the stack, and a branch-register pops from the stack. This optimization can eliminate the misprediction cost of simple procedure calls, as the calling branch is susceptible to hardware prediction, and the returning branch is predictable by the branch-return optimization. Like the conditional-skip optimization described above, this feature is performed by the initial Zeus implementation and requires no specific architectural feature to access or implement.
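A return-address predictor of the kind described above can be sketched as a bounded stack. This Python model is an illustration under assumptions: the class name `ReturnPredictor` and its methods are hypothetical, and dropping the oldest entry when the stack is full is one possible overflow policy, not one stated for Zeus.

```python
from collections import deque


class ReturnPredictor:
    """Finite-depth stack of predicted return addresses."""

    def __init__(self, depth=1):
        # deque with maxlen silently discards the oldest entry on overflow
        self.stack = deque(maxlen=depth)

    def on_branch_link(self, return_address):
        """A branch-and-link pushes the address that the callee will return to."""
        self.stack.append(return_address)

    def predict_return(self):
        """A returning branch pops its predicted target, or None if nothing is known."""
        return self.stack.pop() if self.stack else None


predictor = ReturnPredictor(depth=2)
for address in (0x100, 0x200, 0x300):    # three nested calls, only two remembered
    predictor.on_branch_link(address)
predictions = [predictor.predict_return() for _ in range(3)]
```

With a depth of one the same class models the simplest form, in which only the most recent return address is kept.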

Zeus implements two related instructions that can eliminate or reduce branch delays for conditional loops, conditional branches, and computed branches. The "branch-hint" instruction has no effect on architectural state, but informs the instruction fetch unit of a potential future branch instruction, giving the addresses of both the branch instruction and of the branch target. Both forms of the instruction specify the branch instruction address relative to the current address as an immediate field. One form (branch-hint-immediate) specifies the branch target address relative to the current address as an immediate field, and the other (branch-hint) specifies the branch target address from a general register. The branch-hint-immediate instruction is generally used to give advance notice to the instruction fetch unit of a branch-conditional instruction, so that instructions at the target of the branch can be fetched in advance of the branch-conditional instruction reaching the execution pipeline. Placing the branch hint as early as possible, and at a point where the extra instruction will not reduce the execution rate, optimizes performance. In other words, an optimizing compiler should insert the branch-hint instruction as early as possible in the basic block where the parcel will contain at most one other "front-end" instruction.

### Additional Load and Execute Resources

Studies of the dynamic distribution of Zeus instructions on various benchmark suites indicate that the most frequently issued instruction classes are load instructions and execute instructions. In a high-performance Zeus implementation, it is advantageous to consider execution pipelines in which the ability to target the machine resources toward issuing load and execute instructions is increased.

One of the means to increase the ability to issue execute-class instructions is to provide the means to issue two execute instructions in a single-issue string. The execution unit actually requires several distinct resources, so by partitioning these resources, the issue capability can be increased without increasing the number of functional units, other than the increased register file read and write ports. The partitioning favored for the initial implementation places all instructions that involve shifting and shuffling in one execution unit, and all instructions that involve multiplication, including fixed-point and floating-point multiply and add in another unit. Resources used for implementing add, subtract, and bitwise logical operations may be duplicated, being modest in size compared to the shift and multiply units, or shared between the two units, as the operations have low-enough latency that two operations might be pipelined within a single issue cycle. These instructions must generally be independent, except perhaps that two simple add, subtract, or bitwise logical instructions may be performed dependently, if the resources for executing simple instructions are shared between the execution units.
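The pairing rule implied by this partitioning can be expressed as a small predicate. The sketch below is an assumption-laden reading of the paragraph above: the three class names and the function signature are invented for illustration, and the rule for dependent pairs follows the "except perhaps" wording rather than any confirmed implementation behavior.

```python
SHIFT, MULTIPLY, SIMPLE = "shift", "multiply", "simple"

def can_pair_execute(first, second, dependent=False, shared_simple=False):
    """Return True if two execute-class instructions may share one issue string.

    first, second  -- resource class of each instruction (SHIFT, MULTIPLY, SIMPLE)
    dependent      -- True if the second instruction consumes the first's result
    shared_simple  -- True if simple-operation resources are shared, not duplicated
    """
    # Two shift-class or two multiply-class instructions compete for one unit.
    if first == second and first != SIMPLE:
        return False
    # Dependent pairs are allowed only for two simple operations that are
    # pipelined through shared simple-operation resources.
    if dependent:
        return shared_simple and first == SIMPLE and second == SIMPLE
    return True
```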

One of the means to increase the ability to issue load-class instructions is to provide the means to issue two load instructions in a single-issue string. This would generally increase the resources required of the data fetch unit and the data cache, but a compensating solution is to steal the resources for the store instruction to execute the second load instruction. Thus, a single-issue string can then contain either two load instructions, or one load instruction and one store instruction, which uses the same register read ports and address computation resources as the basic 5-instruction string. This capability also may be employed to provide support for unaligned load and store instructions, where a single-issue string may contain as an alternative a single unaligned load or store instruction which uses the resources of the two load-class units in concert to accomplish the unaligned memory operation.
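The resource sharing described above can be modeled by counting load-class units. In this sketch an aligned operation occupies one unit, an unaligned operation occupies both, and at most one store is permitted because only one of the two units carries store resources. The function name and the tuple encoding of operations are assumptions made for the example.

```python
def memory_ops_fit(ops):
    """Check whether the memory operations of one issue string fit.

    ops is a list of (kind, aligned) tuples, where kind is "load" or "store"
    and aligned is a bool. This encoding is hypothetical.
    """
    units = 0
    stores = 0
    for kind, aligned in ops:
        if kind not in ("load", "store"):
            raise ValueError(f"unknown memory operation: {kind}")
        # An unaligned operation uses both load-class units in concert.
        units += 1 if aligned else 2
        if kind == "store":
            stores += 1
    # Two units in total; only one of them can perform a store.
    return units <= 2 and stores <= 1
```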

### Result Forwarding

When temporally adjacent instructions are executed by separate resources, the results of the first instruction must generally be forwarded directly to the resource used to execute the second instruction, where the result replaces a value which may have been fetched from a register file. Such forwarding paths use significant resources. A Zeus implementation must generally provide forwarding resources so that dependencies from earlier instructions within a string are immediately forwarded to later instructions, except between a first and second execution instruction as described above. In addition, when forwarding results from the execution units back to the data fetch unit, additional delay may be incurred.
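The effect of forwarding on operand selection can be illustrated with a toy model in which a forwarded result, when present, replaces the value read from the register file. The function names and the tuple encoding of instructions are hypothetical, and the model ignores both the exception for paired execute instructions and the extra delay toward the data fetch unit.

```python
from operator import add, sub

def read_operand(register, register_file, forwarded):
    # A forwarded result takes priority over the (stale) register file value.
    if register in forwarded:
        return forwarded[register]
    return register_file[register]

def run_string(instructions, register_file):
    """Execute one string of (dest, op, src_a, src_b) tuples in order.

    Results are held in `forwarded` until the end of the string, then
    written back to the register file (a dict of register -> value).
    """
    forwarded = {}
    for dest, op, src_a, src_b in instructions:
        a = read_operand(src_a, register_file, forwarded)
        b = read_operand(src_b, register_file, forwarded)
        forwarded[dest] = op(a, b)
    register_file.update(forwarded)
    return register_file

# r3 = r1 + r2, then r4 = r3 - r1 using the forwarded r3.
regs = run_string([(3, add, 1, 2), (4, sub, 3, 1)], {1: 2, 2: 3, 3: 0, 4: 0})
```

Without the forwarding lookup, the second instruction would read the old value of register 3 from the register file.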

# Instruction Set

This section describes the instruction set in complete architectural detail. Operation codes are numerically defined by their position in the following operation code tables, and are referred to symbolically in the detailed instruction definitions. Entries that span more than one location in the table define the operation code identifier as the smallest value of all the locations spanned. The value of the symbol can be calculated from the sum of the legend values to the left and above the identifier.

Instructions that have great similarity and identical formats are grouped together. Starting on a new page, each category of instructions is named and introduced.

The Operation codes section lists each instruction by mnemonic that is defined on that page. A textual interpretation of each instruction is shown beside each mnemonic.

The Equivalences section lists additional instructions known to assemblers that are equivalent or special cases of base instructions, again with a textual interpretation of each instruction beside each mnemonic. Below the list, each equivalent instruction is defined, either in terms of a base instruction or another equivalent instruction. The symbol between the instruction and the definition has a particular meaning. If it is an arrow ( $\leftarrow$  or  $\rightarrow$ ), it connects two mathematically equivalent operations, and the arrow direction indicates which form is preferred and produced in a reverse assembly. If the symbol is a ( $\Leftarrow$ ), the form on the left is assembled into the form on the right solely for encoding purposes, and the form on the right is otherwise illegal in the assembler. The parameters in these definitions are formal; the names are solely for pattern-matching purposes, even though they may be suggestive of a particular meaning.

The Redundancies section lists instructions and operand values that may also be performed by other instructions in the instruction set. The symbol connecting the two forms is a  $(\Leftrightarrow)$ , which indicates that the two forms are mathematically equivalent, both are legal, but the assembler does not transform one into the other.

The Selection section lists instructions and equivalences together in a tabular form that highlights the structure of the instruction mnemonics.

The Format section lists (1) the assembler format, (2) the C intrinsics format, (3) the bit-level instruction format, and (4) a definition of bit-level instruction format fields that are not a one-for-one match with named fields in the assembler format.

The Definition section gives a precise definition of each basic instruction.

The Exceptions section lists exceptions that may be caused by the execution of the instructions in this category.


# Major Operation Codes

All instructions are 32 bits in size, and use the high order 8 bits to specify a major operation code.

| 31 24 | 23 0  |
|-------|-------|
| major | other |
| 8     | 24    |
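As an illustration of this layout, the major field can be separated from the remaining 24 bits with a shift and a mask. The function names below are hypothetical and the instruction word is assumed to be held as an unsigned 32-bit integer.

```python
def split_instruction(word):
    """Split a 32-bit instruction word into (major, other)."""
    if not 0 <= word <= 0xFFFFFFFF:
        raise ValueError("instruction word must fit in 32 bits")
    major = (word >> 24) & 0xFF   # bits 31..24
    other = word & 0xFFFFFF       # bits 23..0
    return major, other

def join_instruction(major, other):
    """Inverse of split_instruction."""
    if not 0 <= major <= 0xFF or not 0 <= other <= 0xFFFFFF:
        raise ValueError("field out of range")
    return (major << 24) | other
```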

The major field is filled with a value specified by the following table:<sup>5</sup>

| MAJOR | 0             | 32          | 64      | 96                  | 128            | 160       | 192          | 224          |
|-------|---------------|-------------|---------|---------------------|----------------|-----------|--------------|--------------|
| 0     | ARES          | MF16        | LIIAL   | 914                 |                | *DE/OS/I  | EMAN         | WALLMATER    |
| 1     | Mod           | M717        | UI 4    | SILE                | 8000           |           | MUM          | WANDLAND     |
| 2     | M0010         | HILL        | UIAC    | प्राध्य             | 6/00/0         |           | FLAUVEL      | WILL DO ST   |
| 3     | WOOUG         | MITTH       | UILLE   | धार्क               | 80000          |           | EMACRE       | MARA MATERIA |
| 4     |               | 4016        | LIJA    | PIA                 |                | IDE/OSTU  | EMALADOM     | WALL MATER   |
| 5     | ALL           | 1017        | UIX     | 87.38               | CEL S          |           | BALADOW      |              |
| 6     | AUSTO         | KOU         | LITZAL  | MIX                 | OF UNIO        |           | ENUCCOUNT    | THE WATER    |
| 7     | AGUEUS        | ROTH .      | UIZA    | 9336                | OLUMUS .       |           | ENAULACONOC  |              |
| 8     | ALTE          | 8216        | UNAL    | Sust                | CATH           | XMCHCHAN  | ECONON       |              |
| 9     | ALTHE         | AJ 13       | UA      | THE STATE OF        | CHETTAR        |           | KOKO         |              |
| 10    | ALITANOEI     | D           | IMAG    | SHA                 | GEVADO         |           | KONGU        |              |
| 11    | METANDRES     | BU/126      | THAT    | 944                 | GENADIA.       |           | KONGU        |              |
| 12    | ALTU          | BOSF 14     | LI124L  | \$126L              | dictu          | COM COM   | ECONODIA     |              |
| 13    | AUTOS         | REF 13      | LIIZE   | 91266               | CHICH          |           | KONS         |              |
| 14    | AUTU          | BOHU        | UIZER   | BI SEE              | CATTLAN        |           | EC04000      |              |
| 15    | AUTON         | BC29178     | UITEU   | 91364               | GETTERN        |           | KONKE        |              |
| 16    | AWG           |             | LUIN    | Stewarts            | OVAS           | HOE/OS/TM | ESCALADOF IA | VACA H MATE  |
| 17    | MMOI          | M           | UU166   | SKERAN              | CHADI          |           | ESCALADOF NZ |              |
| 18    | AON           | AMOF        | LIVILAL | SCHAR               | 60%            |           | ESCALADOF    |              |
| 19    | MOB           | BWONE       | WIN     | KSHAM               | COD            |           |              | WALMUG       |
| 20    | ANON          | 84          | KEUU    | SUBSHILL            | 600            | 25M2714   | EMAGA        |              |
| 21    | MAJE          | dCd.        | UUSSE   | Wille               | CHAR           |           | EMAGE        |              |
| 22    |               | AU          | LUIZAL  | SARAHAN             | CACOLAN        |           | EMAX         |              |
| 23    |               | BCEU        | LIULIAS | STATE OF THE PARTY. |                |           | EDCINCI      |              |
| 24    | KOM           | De 12       | LANAL   |                     | OCOPH          | MEXTRACT  | EUTRACTI     |              |
| 25    |               | DAF 32      | LILIMA  |                     |                | SHOW      | EEXTRACTIV   |              |
| 26    |               | ≥ 12        | LEMAN   |                     |                |           | EEATACTA     | WTABLEL      |
| 27    |               | De 17       | LLLIAN  |                     | 6              |           | - //         | WIABLEL      |
| 28    |               |             | LIB     | Ç.                  | 014            | Beaffil   | £16          | WINDLES      |
| 29    |               | RPE         | LAA     |                     | 935            | KV-STI    | E32          | WSWTICHE     |
| 30    |               | pert:       |         | ·                   | 044            | 20-61     | E 37         |              |
| 31    | ALL PROPERTY. | MANOR.      | LANCE T | GALLY OF            | 8178           | A3-01     | E 126        | WANOR        |

major operation code field values

<sup>&</sup>lt;sup>5</sup>Hlank table entries cause the Reserved Instruction exception to occur.

### Minor Operation Codes

For the major operation field values A.MINOR, B.MINOR, L.MINOR, S.MINOR, G.8, G.16, G.32, G.64, G.128, XSHIFTI, XSHIFT, E.8, E.16, E.32, E.64, E.128, W.MINOR.L, and W.MINOR.B, the lowest-order six bits in the instruction specify a minor operation code:
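A minimal sketch of extracting the minor field follows. It also splits the 6-bit value into the column and row legend values used by the minor operation code tables, assuming columns 0, 8, ..., 56 and rows 0 through 7. The function names are hypothetical.

```python
def minor_field(word):
    """Lowest-order six bits of a 32-bit instruction word."""
    if not 0 <= word <= 0xFFFFFFFF:
        raise ValueError("instruction word must fit in 32 bits")
    return word & 0x3F

def minor_position(minor):
    """Table position (column legend, row legend) for a minor value."""
    if not 0 <= minor <= 0x3F:
        raise ValueError("minor operation code must fit in 6 bits")
    return minor & 0x38, minor & 0x07
```

For example, a word whose low six bits are 43 decodes to column 40, row 3.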



The minor field is filled with a value from one of the following tables:

| A.MINOR | 0      | 8     | 16          | 24         | 32 | 40      | 48      | 56  |
|---------|--------|-------|-------------|------------|----|---------|---------|-----|
| 0       |        | AMO   | ASETE       | ASETEF     |    | ASHEI   | ASTEMOD |     |
| 1       | MOD    | AXOR  | ASETNE      | ASETLGE    |    | 1       |         |     |
| 2       | M000   | AOR   | ASETANDE    | ASETLE     |    | ASHLIO  | 1       |     |
| 3       | MODUO  | AMON  | ASSTANDAS   | ASETGEF    |    | ASHLIUO |         |     |
| 4       |        | AORN  | ASSTULZ.    | METEL X    | -  |         | ASHLEUM |     |
| 5       | ASUB   | AUNOR | ASETGE/GEZ  | ASETE GF X |    | 1       |         |     |
| 6       | ASUMO  | ANOR  | ASETILUAGE  | ASETURX    |    | ASH491  | 1       |     |
| 7       | ASUBUO | MAND  | ASETGEU LEZ | ASETGEF X  |    | ASHEDU  | 1       | KON |

minor operation code field values for A.MINOR

| B.MINOR | 0      | 8 | 16 | 24 | 32 | 40 | 48 | 56 |
|---------|--------|---|----|----|----|----|----|----|
| 0       |        |   |    |    |    |    |    |    |
| 1       | B. JAK |   |    |    |    |    |    |    |
| 2       | 2007   |   |    |    |    |    |    |    |
| 3       | MOOWN  |   |    |    |    |    |    |    |
| 4       | BGAYE  |   |    |    |    |    |    |    |
| 5       | MACE   |   |    |    |    |    |    |    |
| 6       | BULT   |   |    |    |    |    |    |    |
| 7       | MARKET |   |    |    |    |    |    |    |

minor operation code field values for B.MINOR

| L.MINOR | 0     | 8      | 16    | 24     | 32 | 40 | 48 | 56 |
|---------|-------|--------|-------|--------|----|----|----|----|
| 0       | LIM   | 1641   | IU:M  | 10441  |    |    |    |    |
| 1       | LIAM  | 1640   | LUISE | LULAB  |    |    |    |    |
| 2       | 1164  | LMAL   | LUILA | LUMAN  |    |    |    |    |
| 3       | LIM   | IMM    | LUILL | LUMANS |    |    |    |    |
| 4       | 137   | 1130L  | win   | LO     |    |    |    |    |
| 5       | -CIM  | 11768  | 10378 | LUI    |    |    |    |    |
| 6       | LISAL | LIZBAL | LUIZA | 7      |    |    |    |    |
| 7       | LIZA  | LIZEM  | LUTZA | ·      |    |    |    |    |

minor operation code field values for L.MINOR


| S.MINOR | 0      | 8       | 16           | 24       | 32 | 40 | 48 | 56 |
|---------|--------|---------|--------------|----------|----|----|----|----|
| 0       | STAL   | 564     | SASSAN       |          |    |    |    |    |
| 1       | 3140   | 44      | SHIM         | 1        |    |    |    |    |
| 2       | TIER   | SAAL    | SCHOOL STATE | 202244   |    |    |    |    |
| 3       | 3124   | 4448    | WWW.         | SOCSIANO |    |    |    |    |
| 4       | \$322  | \$174L  | SARATA       | 9        |    |    |    |    |
| 5       | 5378   | \$1700  | SUSSELLA     |          |    |    |    |    |
| 6       | \$32A  | 31264   | MADMA        |          |    |    |    |    |
| 7       | \$12AP | 3176/40 | SADMA        |          |    |    |    |    |

minor operation code field values for S.MINOR

| G.size | 0     | 8 | 16            | 24        | 32      | 40                             | 48      | 56     |
|--------|-------|---|---------------|-----------|---------|--------------------------------|---------|--------|
| 0      |       |   | OUT           | GLUB      | CADOM   | CHARLES AN                     | CE CUCO | 6400   |
| 1      | 2000  |   | CHINE         | 041.0     | C/CG5-0 | OSUB-E                         |         | COOL   |
| 2      | 64000 |   | GEWEE         | GETTU     | GOOF    | CSUB F                         |         | cuu    |
| 3      | GGGGG |   | armore        | GUNGE     | टळार    | COLUMN TO THE PERSON TO THE PE | 1       |        |
| 4      |       |   | OUT CLU       | GUETRY IS | COOL    | CALANIAN                       | CALLERA | COLAL  |
| 5      | GAJO  |   | and/d2        | CLINO X   | GOOD    | CALADO                         |         | CALLAU |
| 6      | GAUSO |   | BUTTURGZ      | GUTUT     | 60000   | GRUE UF                        | 1       | CASA   |
| 7      | OSUMO |   | CONTRACTOR OF | GELTCE I  | 8488-02 | CHAUC                          |         | CCOM   |

minor operation code field values for G.size

| XSHIFTI | 0      |        | 16       | 24     | 32    | 49       | 48    | 56         |
|---------|--------|--------|----------|--------|-------|----------|-------|------------|
| 0       | NSHU . | IS-4UÖ |          | XS-480 |       | EXMICI   |       | PCOMPRESSI |
|         | KVC10  | 754.00 | i Sandha | N9-BU  | XIOYU | SE SAMON | XAOYA | HOWEIG     |

minor operation code field values for XSHIFTI

| XSHIFT | 0     |       | 16     | 24    | 32    | 40       | 48    | 56          |
|--------|-------|-------|--------|-------|-------|----------|-------|-------------|
| )<br>  | XSHL  | W40   |        | XS-48 |       | XEXPANO  |       | AC OMPRESS  |
| 6<br>7 | r. en | KHŁÓU | XL-404 | XSHBU | zicax | XEXPANDU | KROTR | AC OMPRESSU |

minor operation code field values for XSHIFT

| l we        | 0        |           | 16     | 24      | 32        | 40        | 48         | 56    |
|-------------|----------|-----------|--------|---------|-----------|-----------|------------|-------|
| 0           | EMAIN    | EMALADOHN | LADOFN | ESLAN   | EMAL      | EMALADO   | EDITAN     | ECON  |
|             | THATT    | THATOOTY  | E/OOF? | 150.007 | EMULU     | EMULACION | EDMI       | KON   |
| 7           | THANT    | IMALADON  | WOOT   | PROPER  | EMAM      | EMULADOM  | EDMI       | Kom   |
| )           | TMAK     | PALACOTE  | UCCH   | PSLANC. | PMLC      | HALACOX   | EDMC       | KON   |
| 4           | TMA/X    | EMALADOFX | LOOFE  | ESLANX  | FMULSUM   | EMAL SAME | EDMIX      | EDW   |
|             | FMALE    | [MALADO!  | TABO   | ESUM    | EMULSUMU  | UNULLUNU  | EDIN       | (ONU  |
| 6           | IMACI    | INJUNDET  | KONI   | ROKE    | THULSUNAN | EMUS SUMS | EMAL SLIME | (MAL) |
| <del></del> | THE WART | FMAL WACE | RON    | KONCFE  | FARE LAK  | EMALSUNC  | ENAUL SUM  | FUNNE |

minor operation code field values for E.size

| A PRIOR DATE | 0            |               | 16        | 74          | 40   | 74 | 7/ |
|--------------|--------------|---------------|-----------|-------------|------|----|----|
| 0            | ACTOR A      | WILL BUILD    |           |             | <br> |    |    |
|              |              | STATESTINE.   |           | <del></del> | <br> |    |    |
| 7            | WILLIAM      | WALL DRIVED   | WILLIAMS. | <del></del> | <br> |    |    |
| 7            | TELEMENT     | WOLLDWAY.     | WILL WINE |             | <br> |    |    |
| -            | AND THOUSAND | WOODWICE      |           | WILLIAM !   | <br> |    |    |
|              |              | WULTINGETS!   |           |             | <br> |    |    |
|              |              | WELLINGTON TO |           |             | <br> |    |    |
| 7            | WILLIAME     | WAINGLA       | WHITHRY ! | THATTERED   | <br> |    |    |

minor operation code field values for W.MINOR.L or W.MINOR.B

For the major operation field values E.MUL.X.I, E.MUL.X.I.U, E.MUL.X.I.M, E.MUL.X.I.C, E.MUL.ADD.X.I, E.MUL.ADD.X.I.U, E.MUL.ADD.X.I.M, E.MUL.ADD.X.I.C, E.CON.X.I.L, E.CON.X.I.B, E.CON.X.I.U.L, E.CON.X.I.U.B, E.CON.X.I.M.L, E.CON.X.I.M.B, E.CON.X.I.C.L, E.CON.X.I.C.B, E.EXTRACT.I, E.EXTRACT.I.U, W.MUL.MAT.X.I.U.L, W.MUL.MAT.X.I.U.B, W.MUL.MAT.X.I.M.L, W.MUL.MAT.X.I.M.B, W.MUL.MAT.X.I.C.L, and W.MUL.MAT.X.I.C.B, another six bits in the instruction specify a minor operation code, which indicates operand size, rounding, and shift amount.

| 31 24 | 23 6  | 5 0   |
|-------|-------|-------|
| major | other | minor |
| 8     | 18    | 6     |

The minor field is filled with a value from the following table. Note that the shift amount field value shown below is the "sh" value, which is encoded in an instruction-dependent manner from the immediate field in the assembler format.

| XJ | . 0  |      | 16    | 74    | 32    | 40     | 46   | 56      |
|----|------|------|-------|-------|-------|--------|------|---------|
| 0  | WA   | 8.46 | 167,0 | ILA   | 121.0 | 32748  | ALLA | MANO    |
|    | W.I  | BA(I | 167.1 | ILAST | 177.1 | ואנו   | 777  | स्रम्   |
| 1  | 87.2 | 1707 | 167.7 | 16N7  | 177.7 | ואנו   | urr  | 64 14.7 |
| 3  | 17.1 | 121  | 167.3 | TANT  | 177.3 | 13.8(3 | MIT  | BANT    |
| 4  | 82.5 | 16.5 | 162.0 | 166,6 | 172.0 | 37.50  | 7.77 | 64 C.0  |
| 5  | 82.1 | 86.1 | 162.1 | 134   | 177.1 | 1261   | - in | बंदा    |
| 6  | 14.7 | 12.7 | 1623  | 1867  | 1227  | 1262   | 77-  | uci     |
| 7  | 321  | 16.1 | 162.1 | 166.1 | 1271  | 1771   | 777  | 142     |

minor operation code field values for E.MUL.X.I, E.MUL.X.I.U, E.MUL.X.I.M, E.MUL.X.I.C, E.MUL.ADD.X.I, E.MUL.ADD.X.I.U, E.MUL.ADD.X.I.M, E.MUL.ADD.X.I.C, E.CON.X.I.L, E.CON.X.I.B, E.CON.X.I.U.L, E.CON.X.I.U.B, E.CON.X.I.M.B, E.CON.X.I.C.L, E.CON.X.I.C.B, E.EXTRACT.I, E.EXTRACT.I.U, W.MUL.MAT.X.I.U.L, W.MUL.MAT.X.I.U.B, W.MUL.MAT.X.I.M.L, W.MUL.MAT.X.I.M.B, W.MUL.MAT.X.I.C.L, and W.MUL.MAT.X.I.C.B

For the major operation field value G.COPY.I, two bits in the instruction specify an operand size:




For the major operation field values G.AND.I, G.NAND.I, G.NOR.I, G.OR.I, G.XOR.I, G.ADD.I, G.ADD.I.O, G.ADD.I.U.O, G.SET.AND.E.I, G.SET.AND.NE.I, G.SET.E.I, G.SET.GE.I, G.SET.L.I.U, G.SUB.I.O, G.SUB.I.U.O, two bits in the instruction specify an operand size:

| 31 24 | 23 18 | 17 12 | 11 10 | 9 0 |
|-------|-------|-------|-------|-----|
| op    | rd    | rc    | sz    | imm |
| 8     | 6     | 6     | 2     | 10  |

The sz field is filled with a value from the following table:

| sz | size |
|----|------|
| 0  | 16   |
| 1  | 32   |
| 2  | 64   |
| 3  | 128  |

operand size field values for G.COPY.I, G.AND.I, G.NAND.I, G.NOR.I, G.OR.I, G.XOR.I, G.ADD.I.O, G.ADD.I.U.O, G.SET.AND.E.I, G.SET.AND.NE.I, G.SET.E.I, G.SET.GE.I, G.SET.L.I.U, G.SUB.I, G.SUB.I.U.O
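As a minimal sketch of how the sz encoding could be read, the following Python splits a 32-bit word in the op/rd/rc/sz/imm layout shown above and derives the operand size. The function name `decode_group_immediate` is hypothetical; the mapping size = 16 shifted left by sz is taken from the size expression `0 || 1 || 0^(4+inst11..10)` used in the Instruction Decode definition.

```python
def decode_group_immediate(inst):
    """Split a 32-bit word laid out as op(8) rd(6) rc(6) sz(2) imm(10).

    Illustrative only: the field positions follow the format table above,
    and the size rule (16, 32, 64, 128 for sz = 0..3) follows the sz table.
    """
    op = (inst >> 24) & 0xFF    # bits 31..24
    rd = (inst >> 18) & 0x3F    # bits 23..18
    rc = (inst >> 12) & 0x3F    # bits 17..12
    sz = (inst >> 10) & 0x3     # bits 11..10
    imm = inst & 0x3FF          # bits 9..0
    size = 16 << sz             # binary 1 followed by (4 + sz) zeros
    return {"op": op, "rd": rd, "rc": rc, "size": size, "imm": imm}
```

The immediate is returned as the raw 10-bit field; any sign extension is left to the individual operation.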

For the major operation field values E.8, E.16, E.32, E.64, E.128, with minor operation field value E.UNARY, another six bits in the instruction specify a unary operation code:

| 31 24 | 23 18 | 17 12 | 11 6  | 5 0   |
|-------|-------|-------|-------|-------|
| major | rd    | rc    | unary | minor |
| 8     | 6     | 6     | 6     | 6     |

The unary field is filled with a value from the following table:

| UNWIT | 0       |                 | 16      | 24       | 1 12      | 40               | 4        |   |
|-------|---------|-----------------|---------|----------|-----------|------------------|----------|---|
| 0     | 100MM   | ESUMEN          | ESMESH  | LALONIA  | LOWWHH    | ESUM             |          |   |
|       | Now1    | STATE           | \$20-77 | HIONTY   |           |                  | 1999070  | - |
|       | LICONI  | PARA            | 139431  | FFLOATH  |           |                  | סוטאעה   |   |
|       | TYOUR . | SATURE.         | STANTE  |          |           | <b>FLOOMOSTU</b> |          |   |
|       | NON     | LEDYNAX         | EZHEFY  | EFLOATEX | EDEPLOYER |                  | 12/14/15 |   |
| 5     | L/OW    | TAUM.           | UNIV    | FILOUT   | EDEFLORE  |                  |          |   |
| -6    | INCAMIN | <b>EMCESSAR</b> | EASS/X  | ENEGRY   | ENGLARIES |                  | ROWN     |   |
| ,     | TROMIN  | THEEST          | (Alg    | ENEG     | ENVIOLE   |                  | ROPY     |   |

unary operation code field values for E.UNARY.size

For the major operation field values A.MINOR and G.MINOR, with minor operation field values A.COM and G.COM, another six bits in the instruction specify a comparison operation code:




The compare field is filled with a value from the following table:

| T CCM | . 0           |         | 16 | 7 | 137 | 46 | - 44 |  |
|-------|---------------|---------|----|---|-----|----|------|--|
| 0     | 1 3 3 M       | ICOME!  |    |   |     |    |      |  |
|       | <b>RUM</b>    | KOMO    |    |   |     |    |      |  |
|       | <b>LEGMAD</b> | ROM !   |    |   |     |    |      |  |
| - 1   | Kimor         | KOMOSI  |    |   |     |    |      |  |
| -     | T KOM         | ICOMP X |    |   |     |    |      |  |
| 3     | NCONO!        | KOLOX   |    |   |     |    |      |  |
|       |               | 7.04    |    |   |     |    |      |  |
|       | I ACOUNTU     | CONGELY |    |   |     |    |      |  |

compare operation code field values for A.COM.op and G.COM.op.size

# General Forms

The general forms of the instructions coded by a major operation code are one of the following:



The general forms of the instructions coded by major and minor operation codes are one of the following:



The general form of the instructions coded by major, minor, and unary operation codes is the following:




Register rd is either a source register or destination register, or both. Registers rc and rb are always source registers. Register ra is always a destination register.
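The register fields named above occupy fixed bit positions in the 32-bit instruction word. The sketch below extracts them in Python; the bit ranges are the ones assigned at the top of the Instruction Decode definition later in this section, while the helper names `field` and `register_fields` are hypothetical.

```python
def field(word, hi, lo):
    """Return bits hi..lo (inclusive) of a 32-bit instruction word."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def register_fields(inst):
    """Extract the major code and the four 6-bit register fields.

    rb shares its bits with the short immediate (simm) and ra shares its
    bits with the minor operation code, so which interpretation applies
    depends on the major operation code.
    """
    return {
        "major": field(inst, 31, 24),
        "rd": field(inst, 23, 18),
        "rc": field(inst, 17, 12),
        "rb": field(inst, 11, 6),
        "ra": field(inst, 5, 0),
    }
```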

### Instruction Fetch

Definition

```
def Thread(th) as
    forever do
        catch exception
            if (EventRegister & EventMask[th]) ≠ 0 then
                if ExceptionState=0 then
                    raise EventInterrupt
                endif
            endif
            inst ← LoadMemoryX(ProgramCounter,ProgramCounter,32,L)
            Instruction(inst)
        endcatch
        case exception of
            EventInterrupt,
            ReservedInstruction,
            AccessDisallowedByVirtualAddress,
            AccessDisallowedByTag,
            AccessDisallowedByGlobalTB,
            AccessDisallowedByLocalTB,
            AccessDetailRequiredByTag,
            AccessDetailRequiredByGlobalTB,
            AccessDetailRequiredByLocalTB,
            MissInGlobalTB,
            MissInLocalTB,
            FixedPointArithmetic,
            FloatingPointArithmetic,
            GatewayDisallowed:
                case ExceptionState of
                    0:
                        PerformException(exception)
                    1:
                        PerformException(SecondException)
                    2:
                        PerformMachineCheck(ThirdException)
                endcase
            TakenBranch:
                ContinuationState ← (ExceptionState=0) ? 0 : ContinuationState
            TakenBranchContinue:
                /* nothing */
            none, others:
                ProgramCounter ← ProgramCounter + 4
                ContinuationState ← (ExceptionState=0) ? 0 : ContinuationState
        endcase
    endforever
enddef
```
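The nested exception handling in the fetch loop can be illustrated with a small Python sketch. It models only the two decisions the definition makes from ExceptionState: which handler is invoked for a raised exception, and how ContinuationState is updated after a taken branch or a normally completed instruction. The function names and the tuple return value are hypothetical conveniences, not part of the architecture.

```python
def handler_for(exception_state, exception):
    """Return (handler name, argument) selected for a raised exception.

    Mirrors the 'case ExceptionState of' arms of the fetch loop; states
    outside 0..2 are not covered by the definition, so they are rejected.
    """
    if exception_state == 0:
        return ("PerformException", exception)
    if exception_state == 1:
        return ("PerformException", "SecondException")
    if exception_state == 2:
        return ("PerformMachineCheck", "ThirdException")
    raise ValueError("ExceptionState outside 0..2 is not defined here")


def continuation_after(exception_state, continuation_state):
    """ContinuationState is cleared only when no exception is in progress."""
    return 0 if exception_state == 0 else continuation_state
```

For example, a ReservedInstruction raised while ExceptionState is 1 is reported as SecondException rather than under its own code.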

### Perform Exception

Definition

```
def PerformException(exception) as
    v ← (exception > 7) ? 7 : exception
    t ← LoadMemory(ExceptionBase,ExceptionBase+Thread*128+64+8*v,64,L)
    if ExceptionState = 0 then
        u ← RegRead(3,128) || RegRead(2,128) || RegRead(1,128) || RegRead(0,128)
        StoreMemory(ExceptionBase,ExceptionBase+Thread*128,512,L,u)
        RegWrite(0,64,ProgramCounter63..2 || PrivilegeLevel)
        RegWrite(1,64,ExceptionBase+Thread*128)
        RegWrite(2,64,exception)
        RegWrite(3,64,FailingAddress)
    endif
    PrivilegeLevel ← t1..0
    ProgramCounter ← t63..2 || 0^2
    case exception of
        AccessDetailRequiredByTag,
        AccessDetailRequiredByGlobalTB,
        AccessDetailRequiredByLocalTB:
            ContinuationState ← ContinuationState + 1
        others:
            /* nothing */
    endcase
    ExceptionState ← ExceptionState + 1
enddef
```

### Instruction Decode

```
def Instruction(inst) as
    major ← inst31..24
    rd ← inst23..18
    rc ← inst17..12
    simm ← rb ← inst11..6
    minor ← ra ← inst5..0
    case major of
        A.RES:
            AlwaysReserved
        A.MINOR:
            minor ← inst5..0
            case minor of
                A.ADD, A.ADD.O, A.ADD.U.O, A.AND, A.ANDN, A.NAND, A.NOR,
                A.OR, A.ORN, A.XNOR, A.XOR:
                    Address(minor,rd,rc,rb)
                A.COM:
                    compare ← inst11..6
                    case compare of
                        A.COM.E, A.COM.NE, A.COM.AND.E, A.COM.AND.NE,
                        A.COM.L, A.COM.GE, A.COM.L.U, A.COM.GE.U:
                            AddressCompare(compare,rd,rc)
                        others:
                            raise ReservedInstruction
                    endcase
                A.SUB, A.SUB.O, A.SUB.U.O,
                A.SET.AND.E, A.SET.AND.NE, A.SET.E, A.SET.NE,
                A.SET.L, A.SET.GE, A.SET.L.U, A.SET.GE.U:
                    AddressReversed(minor,rd,rc,rb)
                A.SHL.I.ADD..A.SHL.I.ADD+3:
                    AddressShiftLeftImmediateAdd(inst1..0,rd,rc,rb)
                A.SHL.I.SUB..A.SHL.I.SUB+3:
                    AddressShiftLeftImmediateSubtract(inst1..0,rd,rc,rb)
                A.SHL.I, A.SHL.I.O, A.SHL.I.U.O, A.SHR.I, A.SHR.I.U, A.ROTR.I:
                    AddressShiftImmediate(minor,rd,rc,simm)
                others:
                    raise ReservedInstruction
            endcase
        A.COPY.I:
            AddressCopyImmediate(major,rd,inst17..0)
        A.ADD.I, A.ADD.I.O, A.ADD.I.U.O, A.AND.I, A.OR.I, A.NAND.I, A.NOR.I, A.XOR.I:
            AddressImmediate(major,rd,rc,inst11..0)
        A.SET.AND.E.I, A.SET.AND.NE.I, A.SET.E.I, A.SET.NE.I,
        A.SET.L.I, A.SET.GE.I, A.SET.L.I.U, A.SET.GE.I.U,
        A.SUB.I, A.SUB.I.O, A.SUB.I.U.O:
            AddressImmediateReversed(major,rd,rc,inst11..0)
        A.MUX:
            AddressTernary(major,rd,rc,rb,ra)
        B.MINOR:
            case minor of
                B:
                    Branch(rd,rc,rb)
                B.BACK:
                    BranchBack(rd,rc,rb)
                B.BARRIER:
                    BranchBarrier(rd,rc,rb)
                B.DOWN:
                    BranchDown(rd,rc,rb)
                B.GATE:
                    BranchGateway(rd,rc,rb)
                B.HALT:
                    BranchHalt(rd,rc,rb)
                B.HINT:
                    BranchHint(rd,inst17..12,simm)
                B.LINK:
                    BranchLink(rd,rc,rb)
                others:
                    raise ReservedInstruction
            endcase
        B.E, B.NE, B.L, B.GE, B.L.U, B.GE.U, B.AND.E, B.AND.NE:
            BranchConditional(major,rd,rc,inst11..0)
        B.HINT.I:
            BranchHintImmediate(inst23..18,inst17..12,inst11..0)
        B.I:
            BranchImmediate(inst23..0)
        B.LINK.I:
            BranchImmediateLink(inst23..0)
        B.E.F.16, B.LG.F.16, B.L.F.16, B.GE.F.16,
        B.E.F.32, B.LG.F.32, B.L.F.32, B.GE.F.32,
        B.E.F.64, B.LG.F.64, B.L.F.64, B.GE.F.64,
        B.E.F.128, B.LG.F.128, B.L.F.128, B.GE.F.128:
            BranchConditionalFloatingPoint(major,rd,rc,inst11..0)
        B.I.F.32, B.NI.F.32, B.NV.F.32, B.V.F.32:
            BranchConditionalVisibilityFloatingPoint(major,rd,rc,inst11..0)
        L.MINOR:
```

```
            case minor of
                L.16.L, L.U.16.L, L.32.L, L.U.32.L, L.64.L, L.U.64.L, L.128.L, L.8, L.U.8,
                L.16.A.L, L.U.16.A.L, L.32.A.L, L.U.32.A.L, L.64.A.L, L.U.64.A.L, L.128.A.L,
                L.16.B, L.U.16.B, L.32.B, L.U.32.B, L.64.B, L.U.64.B, L.128.B,
                L.16.A.B, L.U.16.A.B, L.32.A.B, L.U.32.A.B, L.64.A.B, L.U.64.A.B, L.128.A.B:
                    Load(minor,rd,rc,rb)
                others:
                    raise ReservedInstruction
            endcase
        L.I.16.L, L.I.U.16.L, L.I.32.L, L.I.U.32.L, L.I.64.L, L.I.U.64.L, L.I.128.L, L.I.8, L.I.U.8,
        L.I.16.A.L, L.I.U.16.A.L, L.I.32.A.L, L.I.U.32.A.L, L.I.64.A.L, L.I.U.64.A.L, L.I.128.A.L,
        L.I.16.A.B, L.I.U.16.A.B, L.I.32.A.B, L.I.U.32.A.B, L.I.64.A.B, L.I.U.64.A.B, L.I.128.A.B:
            LoadImmediate(major,rd,rc,inst11..0)
        S.MINOR:
            case minor of
                S.16.L, S.32.L, S.64.L, S.128.L, S.8,
                S.16.A.L, S.32.A.L, S.64.A.L, S.128.A.L,
                S.A.S.64.A.L, S.C.S.64.A.L, S.M.S.64.A.L, S.MUX.64.A.L,
                S.16.B, S.32.B, S.64.B, S.128.B,
                S.16.A.B, S.32.A.B, S.64.A.B, S.128.A.B,
                S.A.S.64.A.B, S.C.S.64.A.B, S.M.S.64.A.B, S.MUX.64.A.B:
                    Store(minor,rd,rc,rb)
                S.D.C.S.64.A.B, S.D.C.S.64.A.L:
                    StoreDoubleCompareSwap(minor,rd,rc,rb)
                others:
                    raise ReservedInstruction
            endcase
        S.I.16.L, S.I.32.L, S.I.64.L, S.I.128.L, S.I.8,
        S.I.16.A.L, S.I.32.A.L, S.I.64.A.L, S.I.128.A.L,
        S.A.S.I.64.A.L, S.C.S.I.64.A.L, S.M.S.I.64.A.L, S.MUX.I.64.A.L,
        S.I.16.B, S.I.32.B, S.I.64.B, S.I.128.B,
        S.I.16.A.B, S.I.32.A.B, S.I.64.A.B, S.I.128.A.B,
        S.A.S.I.64.A.B, S.C.S.I.64.A.B, S.M.S.I.64.A.B, S.MUX.I.64.A.B:
            StoreImmediate(major,rd,rc,inst11..0)
        G.8, G.16, G.32, G.64, G.128:
            minor ← inst5..0
            size ← 0 || 1 || 0^(3+major-G.8)
            case minor of
                G.ADD, G.ADD.L, G.ADD.L.U, G.ADD.O, G.ADD.U.O:
                    Group(minor,size,rd,rc,rb)
                G.ADD.H.C, G.ADD.H.F, G.ADD.H.N, G.ADD.H.Z,
                G.ADD.H.U.C, G.ADD.H.U.F, G.ADD.H.U.N, G.ADD.H.U.Z:
                    GroupAddHalve(minor,inst1..0,size,rd,rc,rb)
                G.AAA, G.ASA:
                    GroupInplace(minor,size,rd,rc,rb)
                G.SET.AND.E, G.SET.AND.NE, G.SET.E, G.SET.NE,
                G.SET.L, G.SET.GE, G.SET.L.U, G.SET.GE.U,
                G.SUB, G.SUB.L, G.SUB.L.U, G.SUB.O, G.SUB.U.O:
                    GroupReversed(minor,size,rd,rc,rb)
                G.SET.E.F, G.SET.LG.F, G.SET.GE.F, G.SET.L.F,
                G.SET.E.F.X, G.SET.LG.F.X, G.SET.GE.F.X, G.SET.L.F.X:
                    GroupReversedFloatingPoint(minor.op, size, minor.round, rd, rc, rb)
                G.SHL.I.ADD..G.SHL.I.ADD+3:
                    GroupShiftLeftImmediateAdd(inst1..0,size,rd,rc,rb)
                G.SHL.I.SUB..G.SHL.I.SUB+3:
                    GroupShiftLeftImmediateSubtract(inst1..0,size,rd,rc,rb)
                G.SUB.H.C, G.SUB.H.F, G.SUB.H.N, G.SUB.H.Z,
                G.SUB.H.U.C, G.SUB.H.U.F, G.SUB.H.U.N, G.SUB.H.U.Z:
                    GroupSubtractHalve(minor,inst1..0,size,rd,rc,rb)
                G.COM:
                    compare ← inst11..6
                    case compare of
                        G.COM.E, G.COM.NE, G.COM.AND.E, G.COM.AND.NE,
                        G.COM.L, G.COM.GE, G.COM.L.U, G.COM.GE.U:
                            GroupCompare(compare,size,rd,rc)
                        others:
                            raise ReservedInstruction
                    endcase
                others:
                    raise ReservedInstruction
            endcase
        G.BOOLEAN..G.BOOLEAN+1:
            GroupBoolean(major,rd,rc,rb,minor)
        G.COPY.I..G.COPY.I+1:
            size ← 0 || 1 || 0^(4+inst17..16)
            GroupCopyImmediate(major,size,rd,inst15..0)
        G.AND.I, G.NAND.I, G.NOR.I, G.OR.I, G.XOR.I,
        G.ADD.I, G.ADD.I.O, G.ADD.I.U.O:
            size ← 0 || 1 || 0^(4+inst11..10)
            GroupImmediate(major,size,rd,rc,inst9..0)
        G.SET.AND.E.I, G.SET.AND.NE.I, G.SET.E.I, G.SET.GE.I, G.SET.L.I,
        G.SET.NE.I, G.SET.GE.I.U, G.SET.L.I.U, G.SUB.I, G.SUB.I.O, G.SUB.I.U.O:
            size ← 0 || 1 || 0^(4+inst11..10)
            GroupImmediateReversed(major,size,rd,rc,inst9..0)
        G.MUX:
            GroupTernary(major,rd,rc,rb,ra)
        X.SHIFT:
            minor ← inst5..2 || 0^2
            size ← 0 || 1 || 0^(inst24 || inst1..0)
            case minor of
                X.EXPAND, X.EXPAND.U, X.SHL, X.SHL.O, X.SHL.U.O,
                X.ROTR, X.SHR, X.SHR.U:
                    Crossbar(minor,size,rd,rc,rb)
                X.SHL.M, X.SHR.M:
                    CrossbarInplace(minor,size,rd,rc,rb)
                others:
                    raise ReservedInstruction
            endcase
        X.EXTRACT:
            CrossbarExtract(major,rd,rc,rb,ra)
        X.DEPOSIT, X.DEPOSIT.U, X.WITHDRAW, X.WITHDRAW.U:
            CrossbarField(major,rd,rc,inst11..6,inst5..0)
        X.DEPOSIT.M:
            CrossbarFieldInplace(major,rd,rc,inst11..6,inst5..0)
        X.SHIFT.I:
            minor ← inst5..0
            case minor5..2 || 0^2 of
                X.COMPRESS.I, X.EXPAND.I, X.ROTR.I, X.SHL.I, X.SHL.I.O, X.SHL.I.U.O,
                X.SHR.I, X.COMPRESS.I.U, X.EXPAND.I.U, X.SHR.I.U:
                    CrossbarShortImmediate(minor,rd,rc,simm)
                X.SHL.M.I, X.SHR.M.I:
                    CrossbarShortImmediateInplace(minor,rd,rc,simm)
                others:
                    raise ReservedInstruction
            endcase
        X.SHUFFLE..X.SHUFFLE+1:
            CrossbarShuffle(major,rd,rc,rb,simm)
        X.SWIZZLE..X.SWIZZLE+3:
            CrossbarSwizzle(major,rd,rc,inst11..6,inst5..0)
        X.SELECT.8:
            CrossbarTernary(major,rd,rc,rb,ra)
        E.8, E.16, E.32, E.64, E.128:
            minor ← inst5..0
            size ← 0 || 1 || 0^(3+major-E.8)
            case minor of
                E.CON, E.CON.U, E.CON.M, E.CON.C,
                E.MUL, E.MUL.U, E.MUL.M, E.MUL.C,
                E.MUL.SUM, E.MUL.SUM.U, E.MUL.SUM.M, E.MUL.SUM.C,
                E.DIV, E.DIV.U, E.MUL.P:
                    Ensemble(minor,size,rd,rc,rb)
                E.CON.F.L, E.CON.F.B, E.CON.C.F.L, E.CON.C.F.B:
                    EnsembleConvolveFloatingPoint(minor,size,rd,rc,rb)
                E.ADD.F.N, E.MUL.C.F.N, E.MUL.F.N, E.DIV.F.N,
                E.ADD.F.Z, E.MUL.C.F.Z, E.MUL.F.Z, E.DIV.F.Z,
                E.ADD.F.F, E.MUL.C.F.F, E.MUL.F.F, E.DIV.F.F,
                E.ADD.F.C, E.MUL.C.F.C, E.MUL.F.C, E.DIV.F.C,
                E.ADD.F, E.MUL.C.F, E.MUL.F, E.DIV.F,
                E.ADD.F.X, E.MUL.C.F.X, E.MUL.F.X, E.DIV.F.X:
                    EnsembleFloatingPoint(minor.op, major.size, minor.round, rd, rc, rb)
                E.MUL.ADD, E.MUL.ADD.U, E.MUL.ADD.M, E.MUL.ADD.C:
                    EnsembleInplace(minor,size,rd,rc,rb)
                E.MUL.SUB, E.MUL.SUB.U, E.MUL.SUB.M, E.MUL.SUB.C:
                    EnsembleInplaceReversed(minor,size,rd,rc,rb)
                E.MUL.SUB.F, E.MUL.SUB.C.F:
                    EnsembleInplaceReversedFloatingPoint(minor,size,rd,rc,rb)
                E.SUB.F.N, E.SUB.F.Z, E.SUB.F.F, E.SUB.F.C, E.SUB.F, E.SUB.F.X:
                    EnsembleReversedFloatingPoint(minor.op, major.size, minor.round, rd, rc, rb)
                E.UNARY:
                    case unary of
                        E.SUM, E.SUM.U, E.LOG.MOST, E.LOG.MOST.U:
                            EnsembleUnary(unary,rd,rc)
                        E.ABS.F, E.ABS.F.X, E.COPY.F, E.COPY.F.X,
                        E.DEFLATE.F, E.DEFLATE.F.N, E.DEFLATE.F.Z,
                        E.DEFLATE.F.F, E.DEFLATE.F.C, E.DEFLATE.F.X,
                        E.FLOAT.F, E.FLOAT.F.N, E.FLOAT.F.Z,
                        E.FLOAT.F.F, E.FLOAT.F.C, E.FLOAT.F.X,
                        E.INFLATE.F, E.INFLATE.F.X, E.NEG.F, E.NEG.F.X,
                        E.RECEST.F, E.RECEST.F.X, E.RSQREST.F, E.RSQREST.F.X,
                        E.SQR.F, E.SQR.F.N, E.SQR.F.Z, E.SQR.F.F, E.SQR.F.C, E.SQR.F.X,
                        E.SUM.F, E.SUM.F.N, E.SUM.F.Z, E.SUM.F.F, E.SUM.F.C, E.SUM.F.X,
                        E.SINK.F, E.SINK.F.X:
                            /* further E.SINK.F rounding variants are illegible in the source */
                            EnsembleUnaryFloatingPoint(unary.op, major.size, unary.round, rd, rc)
                        others:
                            raise ReservedInstruction
                    endcase
                others:
                    raise ReservedInstruction
            endcase
        E.CON.X.I.L, E.CON.X.I.B, E.CON.X.I.U.L, E.CON.X.I.U.B,
        E.CON.X.I.M.L, E.CON.X.I.M.B, E.CON.X.I.C.L, E.CON.X.I.C.B:
            size ← 1 || 0^(3+inst5..4)
            EnsembleConvolveExtractImmediate(major,inst3..2,size,rd,rc,rb,inst1..0)
        E.MUL.X, E.EXTRACT, E.SCAL.ADD.X:
            EnsembleExtract(major,rd,rc,rb,ra)
        E.EXTRACT.I, E.EXTRACT.I.U, E.MUL.X.I, E.MUL.X.I.U, E.MUL.X.I.M, E.MUL.X.I.C:
            size ← 1 || 0^(3+inst5..4)
            EnsembleExtractImmediate(major,inst3..2,size,rd,rc,rb,inst1..0)
        E.MUL.ADD.X.I, E.MUL.ADD.X.I.U, E.MUL.ADD.X.I.M, E.MUL.ADD.X.I.C:
            size ← 1 || 0^(3+inst5..4)
            EnsembleExtractImmediateInplace(major,inst3..2,size,rd,rc,rb,inst1..0)
        E.MUL.G.8, E.MUL.G.64:
            /* the size expression for this case is illegible in the source */
            EnsembleTernary(major,size,rd,rc,rb,ra)
        E.MUL.ADD.F.16, E.MUL.ADD.F.32, E.MUL.ADD.F.64, E.MUL.ADD.F.128,
        E.MUL.SUB.F.16, E.MUL.SUB.F.32, E.MUL.SUB.F.64, E.MUL.SUB.F.128,
        E.SCAL.ADD.F.16, E.SCAL.ADD.F.32, E.SCAL.ADD.F.64:
            EnsembleTernaryFloatingPoint(major,rd,rc,rb,ra)
        W.MINOR.B, W.MINOR.L:
            case minor of
                W.TRANSLATE.8, W.TRANSLATE.16, W.TRANSLATE.32, W.TRANSLATE.64:
                    size ← 1 || 0^(3+inst5..4)
                    WideTranslate(major,size,rd,rc,rb)
                W.MUL.MAT.8, W.MUL.MAT.16, W.MUL.MAT.32, W.MUL.MAT.64,
                W.MUL.MAT.U.8, W.MUL.MAT.U.16, W.MUL.MAT.U.32, W.MUL.MAT.U.64,
                W.MUL.MAT.M.8, W.MUL.MAT.M.16, W.MUL.MAT.M.32, W.MUL.MAT.M.64,
                W.MUL.MAT.C.8, W.MUL.MAT.C.16, W.MUL.MAT.C.32, W.MUL.MAT.C.64,
                W.MUL.MAT.P.8, W.MUL.MAT.P.16, W.MUL.MAT.P.32, W.MUL.MAT.P.64:
                    size ← 1 || 0^(3+inst5..4)
                    WideMultiply(major,minor,size,rd,rc,rb)
                W.MUL.MAT.F.16, W.MUL.MAT.F.32, W.MUL.MAT.F.64,
                W.MUL.MAT.C.F.16, W.MUL.MAT.C.F.32, W.MUL.MAT.C.F.64:
                    size ← 1 || 0^(3+inst5..4)
                    WideFloatingPointMultiply(major,minor,size,rd,rc,rb)
                others:
                    raise ReservedInstruction
            endcase
        W.MUL.MAT.X.B, W.MUL.MAT.X.L:
            WideExtract(major,rd,rc,rb,ra)
        W.MUL.MAT.X.I.B, W.MUL.MAT.X.I.L, W.MUL.MAT.X.I.U.B, W.MUL.MAT.X.I.U.L,
        W.MUL.MAT.X.I.M.B, W.MUL.MAT.X.I.M.L, W.MUL.MAT.X.I.C.B, W.MUL.MAT.X.I.C.L:
            size ← 1 || 0^(3+inst5..4)
            WideExtractImmediate(major,inst3..2,size,rd,rc,rb,inst1..0)
        W.MUL.MAT.G.B, W.MUL.MAT.G.L:
            WideMultiplyGalois(major,rd,rc,rb,ra)
        W.SWITCH.B, W.SWITCH.L:
            WideSwitch(major,rd,rc,rb,ra)
        others:
            raise ReservedInstruction
    endcase
enddef
```
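The decode definition writes operand sizes in concatenation notation, for example `0 || 1 || 0^(3+major-G.8)`: a single 1 bit followed by a run of zeros, which is a power of two. The Python sketch below shows that reading for two of the size expressions. The function names are hypothetical, and because this section does not give the numeric values of the major operation codes, the group size is parameterized by the offset of the major code from G.8 rather than by the code itself.

```python
def group_size(major_offset):
    """size <- 0 || 1 || 0^(3 + major - G.8).

    major_offset is (major - G.8), i.e. 0 for G.8 up to 4 for G.128.
    A 1 bit followed by n zero bits has the value 2**n.
    """
    return 1 << (3 + major_offset)


def crossbar_shift_size(inst):
    """size <- 0 || 1 || 0^(inst24 || inst1..0) for the X.SHIFT major code.

    The exponent is the 3-bit value formed by bit 24 (most significant)
    concatenated with bits 1..0 of the instruction word.
    """
    exponent = (((inst >> 24) & 1) << 2) | (inst & 0x3)
    return 1 << exponent
```

With this reading, offsets 0 through 4 give sizes 8, 16, 32, 64, and 128, matching the names G.8 through G.128.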


# Always Reserved

This operation generates a reserved instruction exception.

# Operation code

| A.RES | Always reserved |
|-------|-----------------|

#### **Format**

A.RES imm

ares(imm)



#### Description

The reserved instruction exception is raised. Software may depend upon this major operation code raising the reserved instruction exception in all implementations. The choice of operation code intentionally ensures that a branch to a zeroed memory area will raise an exception.

#### **Definition**

def AlwaysReserved as raise ReservedInstruction enddef

#### Exceptions

Reserved Instruction

# **Address**

These operations perform calculations with two general register values, placing the result in a general register.

## Operation codes

| A.ADD     | Address add                         |
|-----------|-------------------------------------|
| A.ADD.O   | Address add signed check overflow   |
| A.ADD.U.O | Address add unsigned check overflow |
| A.AND     | Address and                         |
| A.ANDN    | Address and not                     |
| A.NAND    | Address not and                     |
| A.NOR     | Address not or                      |
| A.OR      | Address or                          |
| A.ORN     | Address or not                      |
| A.XNOR    | Address exclusive nor               |
| A.XOR     | Address xor                         |

# Redundancies

| A.OR rd=rc,rc      | ⇔ | A.COPY rd=rc          |
|--------------------|---|-----------------------|
| A.AND rd=rc,rc     | ⇔ | A.COPY rd=rc          |
| A.NAND rd=rc,rc    | ⇔ | A.NOT rd=rc           |
| A.NOR rd=rc,rc     | ⇔ | A.NOT rd=rc           |
| A.XNOR rd=rc,rc    | ⇔ | A.SET rd              |
| A.XOR rd=rc,rc     | ⇔ | A.ZERO rd             |
| A.ADD rd=rc,rc     | ⇔ | A.SHL.I rd=rc,1       |
| A.ADD.O rd=rc,rc   | ⇔ | A.SHL.I.O rd=rc,1     |
| A.ADD.U.O rd=rc,rc | ⇔ | A.SHL.I.U.O rd=rc,1   |

## Selection

| class      | operation                        | check      |
|------------|----------------------------------|------------|
| arithmetic | ADD                              | NONE O U.O |
| bitwise    | OR AND XOR ANDN NOR NAND XNOR ORN |            |


#### **Format**

op rd=rc,rb

rd=op(rc,rb)



#### Description

The contents of registers rc and rb are fetched and the specified operation is performed on these operands. The result is placed into register rd.

# Definition

```
def Address(op,rd,rc,rb) as
    c ← RegRead(rc, 64)
    b ← RegRead(rb, 64)
    case op of
        A.ADD:
            a ← c + b
        A.ADD.O:
            t ← (c63 || c) + (b63 || b)
            if t64 ≠ t63 then
                raise FixedPointArithmetic
            endif
            a ← t63..0
        A.ADD.U.O:
            t ← (0^1 || c) + (0^1 || b)
            if t64 ≠ 0 then
                raise FixedPointArithmetic
            endif
            a ← t63..0
        A.AND:
            a ← c and b
        A.OR:
            a ← c or b
        A.XOR:
            a ← c xor b
        A.ANDN:
            a ← c and not b
        A.NAND:
            a ← not (c and b)
        A.NOR:
            a ← not (c or b)
        A.XNOR:
            a ← not (c xor b)
        A.ORN:
            a ← c or not b
    endcase
    RegWrite(rd, 64, a)
enddef
```
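The two overflow-checked additions differ only in how the operands are widened to 65 bits before adding. The following Python sketch mirrors that step under the assumption of 64-bit unsigned inputs; the function names and the `FixedPointArithmetic` class are stand-ins introduced for illustration.

```python
MASK64 = (1 << 64) - 1
MASK65 = (1 << 65) - 1


class FixedPointArithmetic(Exception):
    """Stand-in for the fixed-point arithmetic exception."""


def add_signed_checked(c, b):
    """Signed add with overflow check: t <- (c63 || c) + (b63 || b)."""
    # Widen each 64-bit operand to 65 bits by repeating its sign bit.
    wide_c = c | (((c >> 63) & 1) << 64)
    wide_b = b | (((b >> 63) & 1) << 64)
    t = (wide_c + wide_b) & MASK65
    # Overflow occurred if bits 64 and 63 of the 65-bit sum disagree.
    if ((t >> 64) & 1) != ((t >> 63) & 1):
        raise FixedPointArithmetic("signed overflow")
    return t & MASK64


def add_unsigned_checked(c, b):
    """Unsigned add with overflow check: t <- (0 || c) + (0 || b)."""
    t = c + b
    # A carry into bit 64 means the unsigned result does not fit.
    if (t >> 64) & 1:
        raise FixedPointArithmetic("unsigned overflow")
    return t & MASK64
```

Adding the all-ones pattern and 1 illustrates the difference: as signed values this is -1 + 1 = 0 with no overflow, while as unsigned values it carries out of bit 63 and raises the exception.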

#### Exceptions

Fixed-point arithmetic

# Address Compare

These operations perform calculations with two general register values and generate a fixed-point arithmetic exception if the condition specified is met.

# Operation codes

| A.COM.AND.E  | Address compare and equal zero         |
|--------------|----------------------------------------|
| A.COM.AND.NE | Address compare and not equal zero     |
| A.COM.E      | Address compare equal                  |
| A.COM.GE     | Address compare greater equal signed   |
| A.COM.GE.U   | Address compare greater equal unsigned |
| A.COM.L      | Address compare less signed            |
| A.COM.L.U    | Address compare less unsigned          |
| A.COM.NE     | Address compare not equal              |
## **Equivalencies**

| A.COM.E.Z  | Address compare equal zero                |
|------------|-------------------------------------------|
| A.COM.G.Z  | Address compare greater zero signed       |
| A.COM.GE.Z | Address compare greater equal zero signed |
| A.COM.L.Z  | Address compare less zero signed          |
| A.COM.LE.Z | Address compare less equal zero signed    |
| A.COM.NE.Z | Address compare not equal zero            |
| A.COM.G    | Address compare greater signed            |
| A.COM.G.U  | Address compare greater unsigned          |
| A.COM.LE   | Address compare less equal signed         |
| A.COM.LE.U | Address compare less equal unsigned       |
| A.FIX      | Address fixed point arithmetic exception  |
| A.NOP      | Address no operation                      |

| A.COM.E.Z rc     | ← A.COM.AND.E rc,rc  |
|------------------|----------------------|
| A.COM.G.Z rc     | ← A.COM.L.U rc,rc    |
| A.COM.GE.Z rc    | ← A.COM.GE rc,rc     |
| A.COM.L.Z rc     | ← A.COM.L rc,rc      |
| A.COM.LE.Z rc    | ← A.COM.GE.U rc,rc   |
| A.COM.NE.Z rc    | ← A.COM.AND.NE rc,rc |
| A.COM.G rc,rd    | → A.COM.L rd,rc      |
| A.COM.G.U rc,rd  | → A.COM.L.U rd,rc    |
| A.COM.LE rc,rd   | → A.COM.GE rd,rc     |
| A.COM.LE.U rc,rd | → A.COM.GE.U rd,rc   |
| A.FIX            | ← A.COM.E 0,0        |
| A.NOP            | ← A.COM.NE 0,0       |

#### Redundancies

| A.COM.E rd,rd  | ⇔ | A.FIX |
|----------------|---|-------|
| A.COM.NE rd,rd | ⇔ | A.NOP |

#### Selection

| class      | operation   | cond           | operand |
|------------|-------------|----------------|---------|
| boolean    | COM.AND COM | E NE           |         |
| arithmetic | COM         | L GE G LE      | NONE U  |
|            | COM         | L GE G LE E NE | Z       |

#### **Format**

A.COM.op rd,rc

acomop(rd,rc)
acomopz(rcd)

| 31 24   | 23 18 | 17 12 | 11 6 | 5 0   |
|---------|-------|-------|------|-------|
| A.MINOR | rd    | rc    | op   | A.COM |
| 8       | 6     | 6     | 6    | 6     |

#### Description

The contents of registers rd and rc are fetched and the specified condition is calculated on these operands. If the specified condition is true, a fixed-point arithmetic exception is generated. This instruction generates no general register results.

```
def AddressCompare(op,rd,rc) as
    d ← RegRead(rd, 128)
    c ← RegRead(rc, 128)
    case op of
        A.COM.E:
            a ← d = c
        A.COM.NE:
            a ← d ≠ c
        A.COM.AND.E:
            a ← (d and c) = 0
        A.COM.AND.NE:
            a ← (d and c) ≠ 0
        A.COM.L:
            a ← (rd = rc) ? (c < 0) : (d < c)
        A.COM.GE:
            a ← (rd = rc) ? (c ≥ 0) : (d ≥ c)
        A.COM.L.U:
            a ← (rd = rc) ? (c > 0) : ((0 || d) < (0 || c))
        A.COM.GE.U:
            a ← (rd = rc) ? (c ≤ 0) : ((0 || d) ≥ (0 || c))
    endcase
    if a then
        raise FixedPointArithmetic
    endif
enddef
```
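The arithmetic comparisons treat the case where both register numbers are the same specially: comparing a register with itself is reinterpreted as a signed comparison against zero, which is what makes the zero-compare equivalencies listed earlier work. The Python sketch below evaluates the condition only; the function names, the string operation codes, and the dictionary register file are hypothetical, and the 128-bit default width simply follows the RegRead width in the definition.

```python
def to_signed(value, width):
    """Interpret an unsigned width-bit value as two's complement."""
    return value - (1 << width) if value >> (width - 1) else value


def address_compare(op, rd, rc, regs, width=128):
    """Return True when the compare condition holds (exception raised)."""
    d, c = regs[rd], regs[rc]                 # unsigned register contents
    ds, cs = to_signed(d, width), to_signed(c, width)
    same = rd == rc                           # same register number
    if op == "E":
        return d == c
    if op == "NE":
        return d != c
    if op == "AND.E":
        return (d & c) == 0
    if op == "AND.NE":
        return (d & c) != 0
    if op == "L":
        return cs < 0 if same else ds < cs
    if op == "GE":
        return cs >= 0 if same else ds >= cs
    if op == "L.U":
        return cs > 0 if same else d < c      # same register: greater than zero
    if op == "GE.U":
        return cs <= 0 if same else d >= c    # same register: less or equal zero
    raise ValueError("unknown compare operation: " + op)
```

For instance, with the same register number in both fields, the less-unsigned comparison reports whether the register is greater than zero as a signed value.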

#### Exceptions

Fixed-point arithmetic

# Address Copy Immediate

This operation produces one immediate value, placing the result in a general register.

#### Operation codes

| A.COPY.I | Address copy immediate |
|----------|------------------------|

#### **Equivalencies**

| A.SET  | Address set  |
|--------|--------------|
| A.ZERO | Address zero |

| A.SET rd  | ← A.COPY.I rd=-1 |
|-----------|------------------|
| A.ZERO rd | ← A.COPY.I rd=0  |

### **Format**

A.COPY.I rd=imm

rd=acopyi(imm)



#### Description

An immediate value is sign-extended from the 18-bit imm field. The result is placed into register rd.

#### **Definition**

def AddressCopyImmediate(op,rd,imm) as

a ← (imm17^110 || imm)

RegWrite(rd, 128, a)

enddef
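Sign extension of the immediate field can be sketched in Python as follows. The helper name `sign_extend` is hypothetical, and the result width is a parameter; the default of 128 bits follows the RegWrite width in the definition above and should be treated as an assumption.

```python
def sign_extend(value, from_bits, to_bits=128):
    """Sign-extend the low from_bits of value to a to_bits-wide pattern."""
    value &= (1 << from_bits) - 1
    if value >> (from_bits - 1):
        # Negative: set every bit above the original field.
        value |= ((1 << to_bits) - 1) ^ ((1 << from_bits) - 1)
    return value
```

An all-ones 18-bit field (the immediate -1) extends to an all-ones register value, and an immediate of 0 stays 0, which is how the set and zero equivalencies are obtained from the copy-immediate operation.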

#### Exceptions

none


# Address Immediate

These operations perform calculations with one general register value and one immediate value, placing the result in a general register.

#### Operation codes

| A.ADD.I     | Address add immediate                         |
|-------------|-----------------------------------------------|
| A.ADD.I.O   | Address add immediate signed check overflow   |
| A.ADD.I.U.O | Address add immediate unsigned check overflow |
| A.AND.I     | Address and immediate                         |
| A.NAND.I    | Address not and immediate                     |
| A.NOR.I     | Address not or immediate                      |
| A.OR.I      | Address or immediate                          |
| A.XOR.I     | Address xor immediate                         |

#### Equivalencies

| A.ANDN.I | Address and not immediate |
|----------|---------------------------|
| A.COPY   | Address copy              |
| A.NOT    | Address not               |
| A.ORN.I  | Address or not immediate  |
| A.XNOR.I | Address xnor immediate    |

| A.ANDN.I rd=rc,imm | → | A.AND.I rd=rc,~imm |
|--------------------|---|--------------------|
| A.COPY rd=rc       | ← | A.OR.I rd=rc,0     |
| A.NOT rd=rc        | ← | A.NOR.I rd=rc,0    |
| A.ORN.I rd=rc,imm  | → | A.OR.I rd=rc,~imm  |
| A.XNOR.I rd=rc,imm | → | A.XOR.I rd=rc,~imm |

#### Redundancies

| AADD.I rd=rc,0     | ⇔ | ACOPY rd=rc |
|--------------------|---|-------------|
| AADD.I.O rd=rc,0   | ⇔ | ACOPY rd=rc |
| AADD.I.U.O rd=rc,0 | ⇔ | ACOPY rd=rc |
| AAND.I rd=rc,0     | ⇔ | AZERO rd    |
| AAND.I rd=rc,-1    | ⇔ | ACOPY rd=rc |
| ANAND.I rd=rc,0    | ⇔ | ASET rd     |
| ANAND.I rd=rc,-1   | ⇔ | ANOT rd=rc  |
| AOR.I rd=rc,-1     | ⇔ | ASET rd     |
| ANOR.I rd=rc,-1    | ⇔ | AZERO rd    |
| AXOR.I rd=rc,0     | ⇔ | ACOPY rd=rc |
| AXOR.I rd=rc,-1    | ⇔ | ANOT rd=rc  |


#### Selection

| class      | operation           | check      |
|------------|---------------------|------------|
| arithmetic | ADD                 | NONE O U.O |
| bitwise    | AND OR NAND NOR XOR |            |

#### **Format**

op rd=rc,imm

rd=op(rc,imm)

| 31 24 | 23 18 | 17 12 | 11 0 |
|-------|-------|-------|------|
| op    | rd    | rc    | imm  |
| 8     | 6     | 6     | 12   |

#### Description

The contents of register rc are fetched, and a 64-bit immediate value is sign-extended from the 12-bit imm field. The specified operation is performed on these operands. The result is placed into register rd.

```
def AddressImmediate(op,rd,rc,imm) as
    i ← imm<sub>11</sub><sup>52</sup> || imm
    c ← RegRead(rc, 64)
    case op of
         AAND.I:
              a ← c and i
         AOR.I:
              a ← c or i
         ANAND.I:
              a ← c nand i
         ANOR.I:
              a ← c nor i
         AXOR.I:
              a ← c xor i
         AADD.I:
              a ← c + i
         AADD.I.O:
              t ← (c<sub>63</sub> || c) + (i<sub>63</sub> || i)
              if t<sub>64</sub> ≠ t<sub>63</sub> then
                   raise FixedPointArithmetic
              endif
              a ← t<sub>63..0</sub>
         AADD.I.U.O:
              t ← (c<sub>63</sub> || c) + (i<sub>63</sub> || i)
              if t<sub>64</sub> ≠ 0 then
                   raise FixedPointArithmetic
              endif
              a ← t<sub>63..0</sub>
    endcase
    RegWrite(rd, 64, a)
enddef
```

#### Exceptions

Fixed-point arithmetic
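As an illustration of the signed overflow check, the following Python sketch models AADD.I and AADD.I.O under the reading that both operands are sign-extended to 65 bits and that overflow is signalled when bits 64 and 63 of the sum differ. The function and exception names are hypothetical stand-ins, not part of the architecture definition.

```python
MASK64 = (1 << 64) - 1


class FixedPointArithmetic(Exception):
    """Stand-in for the fixed-point arithmetic exception."""


def sext(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as a signed integer."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def aadd_i(c: int, imm12: int) -> int:
    """AADD.I sketch: wrap-around 64-bit add of a sign-extended 12-bit immediate."""
    return (c + sext(imm12, 12)) & MASK64


def aadd_i_o(c: int, imm12: int) -> int:
    """AADD.I.O sketch: same add, but raise when the signed result overflows."""
    t = (sext(c, 64) + sext(imm12, 12)) & ((1 << 65) - 1)  # 65-bit sum
    if (t >> 64) & 1 != (t >> 63) & 1:                     # bit 64 differs from bit 63
        raise FixedPointArithmetic("signed overflow")
    return t & MASK64
```

Adding 1 to the largest positive 64-bit value raises in `aadd_i_o` but wraps silently in `aadd_i`.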

# Address Immediate Reversed

These operations perform calculations with one general register value and one immediate value, placing the result in a general register.

#### Operation codes

| ASETAND.E.I   | Address set and equal immediate                    |
|---------------|----------------------------------------------------|
| ASETAND.NE.I  | Address set and not equal immediate                |
| ASET.E.I      | Address set equal immediate                        |
| ASET.GE.I     | Address set greater equal immediate signed         |
| ASET.L.I      | Address set less immediate signed                  |
| ASET.NE.I     | Address set not equal immediate                    |
| ASET.GE.I.U   | Address set greater equal immediate unsigned       |
| ASET.L.I.U    | Address set less immediate unsigned                |
| ASUB.I        | Address subtract immediate                         |
| ASUB.I.O      | Address subtract immediate signed check overflow   |
| ASUB.I.U.O    | Address subtract immediate unsigned check overflow |

#### **Equivalencies**

| ANEG        | Address negate                            |
|-------------|-------------------------------------------|
| ANEG.O      | Address negate signed check overflow      |
| ASET.G.I    | Address set greater immediate signed      |
| ASET.G.I.U  | Address set greater immediate unsigned    |
| ASET.LE.I   | Address set less equal immediate signed   |
| ASET.LE.I.U | Address set less equal immediate unsigned |

| ANEG rd=rc            | → ASUB.I rd=0,rc          |
|-----------------------|---------------------------|
| ANEG.O rd=rc          | → ASUB.I.O rd=0,rc        |
| ASET.G.I rd=imm,rc    | → ASET.GE.I rd=imm+1,rc   |
| ASET.G.I.U rd=imm,rc  | → ASET.GE.I.U rd=imm+1,rc |
| ASET.LE.I rd=imm,rc   | → ASET.L.I rd=imm-1,rc    |
| ASET.LE.I.U rd=imm,rc | → ASET.L.I.U rd=imm-1,rc  |


#### Redundancies

| ASETAND.E.I rd=rc,0   | ⇔ ASET rd           |
|-----------------------|---------------------|
| ASETAND.NE.I rd=rc,0  | ⇔ AZERO rd          |
| ASETAND.E.I rd=rc,-1  | ⇔ ASET.E.Z rd=rc    |
| ASETAND.NE.I rd=rc,-1 | ⇔ ASET.NE.Z rd=rc   |
| ASET.E.I rd=rc,0      | ⇔ ASET.E.Z rd=rc    |
| ASET.GE.I rd=rc,0     | ⇔ ASET.GE.Z rd=rc   |
| ASET.L.I rd=rc,0      | ⇔ ASET.L.Z rd=rc    |
| ASET.NE.I rd=rc,0     | ⇔ ASET.NE.Z rd=rc   |
| ASET.GE.I.U rd=rc,0   | ⇔ ASET.GE.U.Z rd=rc |
| ASET.L.I.U rd=rc,0    | ⇔ ASET.L.U.Z rd=rc  |

#### Selection

| class      | operation  | cond      | form | type   | check |
|------------|------------|-----------|------|--------|-------|
| arithmetic | SUB        |           | I    |        |       |
|            |            |           | I    | NONE U | O     |
| boolean    | SETAND SET | E NE      | I    |        |       |
|            | SET        | L GE G LE | I    | NONE U |       |
#### **Format**

op rd=imm,rc

rd=op(imm,rc)



#### Description

The contents of register rc are fetched, and a 64-bit immediate value is sign-extended from the 12-bit imm field. The specified operation is performed on these operands. The result is placed into register rd.

```
               a ← t<sub>63..0</sub>
          ASUB.I.U.O:
               t ← (i<sub>63</sub> || i) - (c<sub>63</sub> || c)
               if t<sub>64</sub> ≠ 0 then
                    raise FixedPointArithmetic
               endif
               a ← t<sub>63..0</sub>
          ASETAND.E.I:
               a ← ((i and c) = 0)<sup>64</sup>
          ASETAND.NE.I:
               a ← ((i and c) ≠ 0)<sup>64</sup>
          ASET.E.I:
               a ← (i = c)<sup>64</sup>
          ASET.NE.I:
               a ← (i ≠ c)<sup>64</sup>
          ASET.L.I:
               a ← (i < c)<sup>64</sup>
          ASET.GE.I:
               a ← (i ≥ c)<sup>64</sup>
          ASET.L.I.U:
               a ← ((0 || i) < (0 || c))<sup>64</sup>
          ASET.GE.I.U:
               a ← ((0 || i) ≥ (0 || c))<sup>64</sup>
     endcase
     RegWrite(rd, 64, a)
enddef
```

#### Exceptions

Fixed-point arithmetic
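The set operations replicate a one-bit comparison result across all 64 bits, with the immediate as the first operand. A minimal Python sketch of three of them follows; the function names are hypothetical and the model assumes 64-bit register values held as non-negative Python integers.

```python
MASK64 = (1 << 64) - 1


def sext(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as a signed integer."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def mask(flag: bool) -> int:
    """Replicate a boolean across 64 bits: all ones if true, zero otherwise."""
    return MASK64 if flag else 0


def aset_l_i(imm12: int, c: int) -> int:
    """ASET.L.I sketch: signed test of (immediate < register)."""
    return mask(sext(imm12, 12) < sext(c, 64))


def aset_l_i_u(imm12: int, c: int) -> int:
    """ASET.L.I.U sketch: the sign-extended immediate is compared as unsigned."""
    return mask((sext(imm12, 12) & MASK64) < (c & MASK64))


def asetand_e_i(imm12: int, c: int) -> int:
    """ASETAND.E.I sketch: test whether (immediate AND register) is zero."""
    return mask((sext(imm12, 12) & c & MASK64) == 0)
```

With an immediate of -1 (field value 0xFFF) and a register value of 0, the signed comparison is true while the unsigned one is false, since -1 becomes the largest unsigned value.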


# Address Reversed

These operations perform calculations with two general register values, placing the result in a general register.

#### Operation codes

| ASETAND.E   | Address set and equal zero               |
|-------------|------------------------------------------|
| ASETAND.NE  | Address set and not equal zero           |
| ASET.E      | Address set equal                        |
| ASET.GE     | Address set greater equal signed         |
| ASET.GE.U   | Address set greater equal unsigned       |
| ASET.L      | Address set less signed                  |
| ASET.L.U    | Address set less unsigned                |
| ASET.NE     | Address set not equal                    |
| ASUB        | Address subtract                         |
| ASUB.O      | Address subtract signed check overflow   |
| ASUB.U.O    | Address subtract unsigned check overflow |

#### Equivalencies

| ASET.E.Z   | Address set equal zero                |
|------------|---------------------------------------|
| ASET.G.Z   | Address set greater zero signed       |
| ASET.GE.Z  | Address set greater equal zero signed |
| ASET.L.Z   | Address set less zero signed          |
| ASET.LE.Z  | Address set less equal zero signed    |
| ASET.NE.Z  | Address set not equal zero            |
| ASET.G     | Address set greater signed            |
| ASET.G.U   | Address set greater unsigned          |
| ASET.LE    | Address set less equal signed         |
| ASET.LE.U  | Address set less equal unsigned       |

| ASET.E.Z rd=rc     | ← ASETAND.E rd=rc,rc  |
|--------------------|-----------------------|
| ASET.G.Z rd=rc     | ← ASET.L.U rd=rc,rc   |
| ASET.GE.Z rd=rc    | ← ASET.GE rd=rc,rc    |
| ASET.L.Z rd=rc     | ← ASET.L rd=rc,rc     |
| ASET.LE.Z rd=rc    | ← ASET.GE.U rd=rc,rc  |
| ASET.NE.Z rd=rc    | ← ASETAND.NE rd=rc,rc |
| ASET.G rd=rb,rc    | → ASET.L rd=rc,rb     |
| ASET.G.U rd=rb,rc  | → ASET.L.U rd=rc,rb   |
| ASET.LE rd=rb,rc   | → ASET.GE rd=rc,rb    |
| ASET.LE.U rd=rb,rc | → ASET.GE.U rd=rc,rb  |

#### Redundancies

| ASET.E rd=rc,rc  | ⇔ | ASET rd  |
|------------------|---|----------|
| ASET.NE rd=rc,rc | ⇔ | AZERO rd |

#### Selection

| class      | operation   | cond           | operand | check |
|------------|-------------|----------------|---------|-------|
| arithmetic | SUB         |                |         |       |
|            |             |                | NONE U  | O     |
| boolean    | SETAND SET  | E NE           |         |       |
|            | SET         | L GE G LE      | NONE U  |       |
|            | SET         | L GE G LE E NE | Z       |       |

#### **Format**

op rd=rb,rc

rd=op(rb,rc)
rd=opz(rcb)

| 31 24   | 23 18 | 17 12 | 11 6 | 5 0 |
|---------|-------|-------|------|-----|
| A.MINOR | rd    | rc    | rb   | op  |
| 8       | 6     | 6     | 6    | 6   |

### **Description**

The contents of registers rc and rb are fetched and the specified operation is performed on these operands. The result is placed into register rd.


```
          a ← ((rc = rb) ? (b > 0) : ((0 || b) < (0 || c)))<sup>64</sup>
     ASET.GE.U:
          a ← ((rc = rb) ? (b ≤ 0) : ((0 || b) ≥ (0 || c)))<sup>64</sup>
     ASUB:
          a ← b - c
     ASUB.O:
          t ← (b<sub>63</sub> || b) - (c<sub>63</sub> || c)
          if t<sub>64</sub> ≠ t<sub>63</sub> then
               raise FixedPointArithmetic
          endif
          a ← t<sub>63..0</sub>
     ASUB.U.O:
          t ← (0<sup>1</sup> || b) - (0<sup>1</sup> || c)
          if t<sub>64</sub> ≠ 0 then
               raise FixedPointArithmetic
          endif
          a ← t<sub>63..0</sub>
     endcase
     RegWrite(rd, 64, a)
enddef
```

#### Exceptions

Fixed-point arithmetic
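The unsigned comparisons double as signed comparisons against zero when both source fields name the same register, which is how forms such as ASET.G.Z fit in without separate opcodes. The Python sketch below models that decode rule for the unsigned less-than case; the register-file dictionary and function names are illustrative assumptions.

```python
MASK64 = (1 << 64) - 1


def to_signed64(x: int) -> int:
    """Interpret a 64-bit pattern as a two's-complement signed integer."""
    x &= MASK64
    return x - (1 << 64) if x >> 63 else x


def aset_l_u(regs: dict, rb: int, rc: int) -> int:
    """Unsigned set-less sketch; rb and rc are register numbers, regs maps number to value."""
    b, c = regs[rb] & MASK64, regs[rc] & MASK64
    if rc == rb:
        # Same register in both fields: an unsigned b < b would always be false,
        # so the encoding is reused to mean the signed test b > 0.
        result = to_signed64(b) > 0
    else:
        result = b < c  # ordinary unsigned comparison
    return MASK64 if result else 0
```

For example, with a register holding all ones (signed -1), `aset_l_u(regs, r, r)` returns zero because -1 is not greater than zero.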

# Address Shift Left Immediate Add

These operations perform calculations with two general register values, placing the result in a general register.

#### Operation codes

| ASHLIADD | Address shift left immediate add |
|----------|----------------------------------|

#### **Format**

ASHLIADD rd=rc,rb,i

rd=op(rc,rb,i)

| 31 24   | 23 18 | 17 12 | 11 6 | 5 2      | 1 0 |
|---------|-------|-------|------|----------|-----|
| A.MINOR | rd    | rc    | rb   | ASHLIADD | sh  |
| 8       | 6     | 6     | 6    | 4        | 2   |

assert 1≤i≤4

sh ← i-1

#### Description

The contents of register rb are shifted left by the immediate amount and added to the contents of register rc. The result is placed into register rd.

#### **Definition**

def AddressShiftLeftImmediateAdd(sh,rd,rc,rb) as

c ← RegRead(rc, 64)

b ← RegRead(rb, 64)

a ← c + (b<sub>62-sh..0</sub> || 0<sup>1+sh</sup>)

RegWrite(rd, 64, a)

enddef

#### Exceptions

none
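A shift-and-add of this kind is the usual building block for scaled indexing (base plus index times 2, 4, 8, or 16). The Python sketch below models the operation under the stated constraint that the shift amount i is between 1 and 4 and is encoded as sh = i-1; the function name is hypothetical.

```python
MASK64 = (1 << 64) - 1


def ashliadd(c: int, b: int, i: int) -> int:
    """Sketch of ASHLIADD rd=rc,rb,i: rc + (rb << i), wrapped to 64 bits."""
    if not 1 <= i <= 4:
        raise ValueError("shift amount must be in 1..4")
    sh = i - 1                            # value carried in the 2-bit sh field
    shifted = (b << (1 + sh)) & MASK64    # low bits of b followed by 1+sh zeros
    return (c + shifted) & MASK64
```

For instance, `ashliadd(0x1000, 3, 3)` models the address of element 3 in an array of 8-byte items based at 0x1000.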


# Address Shift Left Immediate Subtract

These operations perform calculations with two general register values, placing the result in a general register.

### Operation codes

| ASHLISUB | Address shift left immediate subtract |
|----------|---------------------------------------|

#### **Format**

ASHLISUB rd=rb,i,rc

rd=op(rb,i,rc)



assert 1≤i≤4

sh ← i-1

#### Description

The contents of register rc are subtracted from the contents of register rb shifted left by the immediate amount. The result is placed into register rd.

#### **Definition**

def AddressShiftLeftImmediateSubtract(op,rd,rc,rb) as

c ← RegRead(rc, 128)

b ← RegRead(rb, 128)

a ← (b<sub>62-sh..0</sub> || 0<sup>1+sh</sup>) - c

RegWrite(rd, 64, a)

enddef

#### **Exceptions**

none

# Address Shift Immediate

These operations perform calculations with one general register value and one immediate value, placing the result in a general register.

#### Operation codes

| ASHL.I       | Address shift left immediate                         |
|--------------|------------------------------------------------------|
| ASHL.I.O     | Address shift left immediate signed check overflow   |
| ASHL.I.U.O   | Address shift left immediate unsigned check overflow |
| ASHR.I       | Address signed shift right immediate                 |
| ASHR.I.U     | Address shift right immediate unsigned               |

### Redundancies

| ASHL.I rd=rc,1     | ⇔ AADD rd=rc,rc     |
|--------------------|---------------------|
| ASHL.I.O rd=rc,1   | ⇔ AADD.O rd=rc,rc   |
| ASHL.I.U.O rd=rc,1 | ⇔ AADD.U.O rd=rc,rc |
| ASHL.I rd=rc,0     | ⇔ ACOPY rd=rc       |
| ASHL.I.O rd=rc,0   | ⇔ ACOPY rd=rc       |
| ASHL.I.U.O rd=rc,0 | ⇔ ACOPY rd=rc       |
| ASHR.I rd=rc,0     | ⇔ ACOPY rd=rc       |
| ASHR.I.U rd=rc,0   | ⇔ ACOPY rd=rc       |

#### Selection

| class | operation | form | operand | check |
|-------|-----------|------|---------|-------|
| shift | SHL       | I    |         |       |
|       |           |      | NONE U  | O     |
|       | SHR       | I    | NONE U  |       |

#### **Format**

op rd=rc,simm

rd=op(rc,simm)

| 31 24   | 23 18 | 17 12 | 11 6 | 5 0 |
|---------|-------|-------|------|-----|
| A.MINOR | rd    | rc    | simm | op  |
| 8       | 6     | 6     | 6    | 6   |

#### Description

The contents of register rc are fetched, and a 6-bit immediate value is taken from the 6-bit simm field. The specified operation is performed on these operands. The result is placed into register rd.


#### **Definition**

```
def AddressShiftImmediate(op,rd,rc,simm) as
     c ← RegRead(rc, 64)
     case op of
          ASHL.I:
               a ← c<sub>63-simm..0</sub> || 0<sup>simm</sup>
          ASHL.I.O:
               if c<sub>63..63-simm</sub> ≠ c<sub>63</sub><sup>simm+1</sup> then
                    raise FixedPointArithmetic
               endif
               a ← c<sub>63-simm..0</sub> || 0<sup>simm</sup>
          ASHL.I.U.O:
               if c<sub>63..64-simm</sub> ≠ 0 then
                    raise FixedPointArithmetic
               endif
               a ← c<sub>63-simm..0</sub> || 0<sup>simm</sup>
          ASHR.I:
               a ← c<sub>63</sub><sup>simm</sup> || c<sub>63..simm</sub>
          ASHR.I.U:
               a ← 0<sup>simm</sup> || c<sub>63..simm</sub>
     endcase
     RegWrite(rd, 64, a)
enddef
```

#### Exceptions

Fixed-point arithmetic
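To make the shift semantics concrete, here is a Python sketch of the signed-overflow left shift and the two right shifts. It assumes the overflow rule is that the top simm+1 bits of the source must all equal the sign bit; the function and exception names are hypothetical.

```python
MASK64 = (1 << 64) - 1


class FixedPointArithmetic(Exception):
    """Stand-in for the fixed-point arithmetic exception."""


def ashl_i_o(c: int, simm: int) -> int:
    """Left shift with signed overflow check, for 0 <= simm <= 63."""
    c &= MASK64
    top = c >> (63 - simm)                      # bits 63 down to 63-simm
    sign = c >> 63
    expected = (1 << (simm + 1)) - 1 if sign else 0
    if top != expected:                         # a significant bit would be shifted out
        raise FixedPointArithmetic("signed overflow on shift")
    return (c << simm) & MASK64


def ashr_i(c: int, simm: int) -> int:
    """Arithmetic right shift: vacated bits are copies of the sign bit."""
    c &= MASK64
    signed = c - (1 << 64) if c >> 63 else c
    return (signed >> simm) & MASK64


def ashr_i_u(c: int, simm: int) -> int:
    """Logical right shift: vacated bits are zero."""
    return (c & MASK64) >> simm
```

Shifting the pattern with only bit 63 set right by 63 gives all ones with `ashr_i` and 1 with `ashr_i_u`.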

# Address Ternary

These operations perform calculations with three general register values, placing the result in a fourth general register.

#### Operation codes

| AMUX | Address multiplex |
|------|-------------------|

#### **Format**

op ra=rd,rc,rb

ra=amux(rd,rc,rb)



#### Description

The contents of registers rd, rc, and rb are fetched. The specified operation is performed on these operands. The result is placed into register ra.

```
def AddressTernary(op,rd,rc,rb,ra) as
     d ← RegRead(rd, 64)
     c ← RegRead(rc, 64)
     b ← RegRead(rb, 64)
     case op of
          AMUX:
               a ← (c and d) or (b and not d)
     endcase
     RegWrite(ra, 64, a)
enddef
```

#### Exceptions

none
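The multiplex operation is a bitwise select: each result bit comes from rc where the corresponding bit of rd is 1 and from rb where it is 0. A one-function Python sketch, with a hypothetical name and 64-bit masking assumed:

```python
MASK64 = (1 << 64) - 1


def amux(d: int, c: int, b: int) -> int:
    """Bitwise select: take bits of c where d is 1, bits of b where d is 0."""
    return ((c & d) | (b & ~d)) & MASK64
```

With a selector of 0xFF00, `amux(0xFF00, 0xAAAA, 0x5555)` takes the upper byte from the second operand and the lower byte from the third.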


# **Branch**

This operation branches to a location specified by a register.

#### Operation codes

| B | Branch |
|---|--------|

#### **Format**

B rd

| 31 24   | 23 18 | 17 12 | 11 6 | 5 0 |
|---------|-------|-------|------|-----|
| B.MINOR | rd    | 0     | 0    | B   |
| 8       | 6     | 6     | 6    | 6   |

#### Description

Execution branches to the address specified by the contents of register rd.

An access disallowed exception occurs if the contents of register rd are not aligned on a quadlet boundary.

#### **Definition**

```
def Branch(rd,rc,rb) as
     if (rc ≠ 0) or (rb ≠ 0) then
          raise ReservedInstruction
     endif
     d ← RegRead(rd, 64)
     if (d<sub>1..0</sub>) ≠ 0 then
          raise AccessDisallowedByVirtualAddress
     endif
     ProgramCounter ← d<sub>63..2</sub> || 0<sup>2</sup>
     raise TakenBranch
enddef
```

#### Exceptions

Reserved Instruction
Access disallowed by virtual address
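The alignment rule amounts to requiring that the two low-order bits of the target address be zero, since instructions occupy four-byte quadlets. A small Python sketch of the check and the resulting program counter follows; the names are hypothetical stand-ins for the architectural exception and state.

```python
MASK64 = (1 << 64) - 1


class AccessDisallowedByVirtualAddress(Exception):
    """Stand-in for the access disallowed by virtual address exception."""


def branch_target(d: int) -> int:
    """Return the new program counter for a register branch, or raise if misaligned."""
    d &= MASK64
    if d & 0b11:                       # low two bits must be zero
        raise AccessDisallowedByVirtualAddress(hex(d))
    return (d >> 2) << 2               # bits 63..2 followed by two zero bits
```

A target of 0x1000 is accepted unchanged, while 0x1002 raises because it falls inside a quadlet.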

# Branch Back

This operation branches to a location specified by the previous contents of register 0, reduces the current privilege level, loads a value from memory, and restores register 0 to the value saved on a previous exception.

#### Operation codes

| B.BACK | Branch back |
|--------|-------------|

#### Format

B.BACK

bback()

| 31 24   | 23 18 | 17 12 | 11 6 | 5 0    |
|---------|-------|-------|------|--------|
| B.MINOR | 0     | 0     | 0    | B.BACK |
| 8       | 6     | 6     | 6    | 6      |

#### Description

Processor context, including program counter and privilege level, is restored from register 0, where it was saved at the last exception. Exception state, if set, is cleared, re-enabling normal exception handling. The contents of register 0 saved at the last exception are restored from memory. The privilege level is only lowered, so that this instruction need not be privileged.

If the previous exception was an Access Detail exception, Continuation State set at the time of the exception affects the operation of the next instruction after this Branch Back, causing the previous Access Detail exception to be inhibited. If software is performing this instruction to abort a sequence ending in an Access Detail exception, it should abort by branching to an instruction that is not affected by Continuation State.

```
def BranchBack(rd,rc,rb) as
     c ← RegRead(rc, 128)
     if (rd ≠ 0) or (rc ≠ 0) or (rb ≠ 0) then
          raise ReservedInstruction
     endif
     a ← LoadMemory(ExceptionBase,ExceptionBase+Thread*128,128,U)
     if PrivilegeLevel > c<sub>1..0</sub> then
          PrivilegeLevel ← c<sub>1..0</sub>
     endif
     ProgramCounter ← c<sub>63..2</sub> || 0<sup>2</sup>
     ExceptionState ← 0
     RegWrite(rd,128,a)
     raise TakenBranchContinue
enddef
```


#### Exceptions

Reserved Instruction
Access disallowed by virtual address
Access disallowed by tag
Access disallowed by global TB
Access disallowed by local TB
Access detail required by tag
Access detail required by local TB
Access detail required by global TB
Local TB miss
Global TB miss

# Branch Barrier

This operation stops the current thread until all pending stores are completed, then branches to a location specified by a register.

#### Operation codes

| B.BARRIER | Branch barrier |
|-----------|----------------|

#### **Format**

B.BARRIER rd

bbarrier(rd)



#### Description

The instruction fetch unit is directed to cease execution until all pending stores are completed. Following the barrier, any previously pre-fetched instructions are discarded and execution branches to the address specified by the contents of register rd.

An access disallowed exception occurs if the contents of register rd are not aligned on a quadlet boundary.

Self-modifying, dynamically-generated, or loaded code may require use of this instruction between storing the code into memory and executing the code.

#### Definition

```
def BranchBarrier(rd,rc,rb) as
     if (rc ≠ 0) or (rb ≠ 0) then
          raise ReservedInstruction
     endif
     d ← RegRead(rd, 64)
     if (d<sub>1..0</sub>) ≠ 0 then
          raise AccessDisallowedByVirtualAddress
     endif
     ProgramCounter ← d<sub>63..2</sub> || 0<sup>2</sup>
     FetchBarrier()
     raise TakenBranch
enddef
```

#### Exceptions

Reserved Instruction


# Branch Conditional

These operations compare two operands and, depending on the result of that comparison, conditionally branch to a nearby code location.

### Operation codes

| BAND.E  | Branch and equal zero         |  |
|---------|-------------------------------|--|
| BAND.NE | Branch and not equal zero     |  |
| B.E     | Branch equal                  |  |
| B.GE    | Branch greater equal signed   |  |
| B.L     | Branch signed less            |  |
| B.NE    | Branch not equal              |  |
| B.GE.U  | Branch greater equal unsigned |  |
| B.L.U   | Branch less unsigned          |  |

### **Equivalencies**

| B.E.Z              | Branch equal zero                |
|--------------------|----------------------------------|
| B.G.Z<sup>6</sup>  | Branch greater zero signed       |
| B.GE.Z<sup>7</sup> | Branch greater equal zero signed |
| B.L.Z<sup>8</sup>  | Branch less zero signed          |
| B.LE.Z<sup>9</sup> | Branch less equal zero signed    |
| B.NE.Z             | Branch not equal zero            |
| B.LE               | Branch less equal signed         |
| B.G                | Branch greater signed            |
| B.LE.U             | Branch less equal unsigned       |
| B.G.U              | Branch greater unsigned          |
| B.NOP              | Branch no operation              |

<sup>6</sup>B.G.Z is encoded as B.L.U with both instruction fields rd and rc equal.

<sup>7</sup>B.GE.Z is encoded as B.GE with both instruction fields rd and rc equal.

<sup>8</sup>B.L.Z is encoded as B.L with both instruction fields rd and rc equal.

<sup>9</sup>B.LE.Z is encoded as B.GE.U with both instruction fields rd and rc equal.

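| B.E.Z rc,target     | ← | BAND.E rc,rc,target  |
|---------------------|---|----------------------|
| B.G.Z rc,target     | ← | B.L.U rc,rc,target   |
| B.GE.Z rc,target    | ← | B.GE rc,rc,target    |
| B.L.Z rc,target     | ← | B.L rc,rc,target     |
| B.LE.Z rc,target    | ← | B.GE.U rc,rc,target  |
| B.NE.Z rc,target    | ← | BAND.NE rc,rc,target |
| B.LE rc,rd,target   | → | B.GE rd,rc,target    |
| B.G rc,rd,target    | → | B.L rd,rc,target     |
| B.LE.U rc,rd,target | → | B.GE.U rd,rc,target  |
| B.G.U rc,rd,target  | → | B.L.U rd,rc,target   |
| B.NOP               |   |                      |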

#### Redundancies

| B.E rc,rc,target  | ⇔ | B.I target |
|-------------------|---|------------|
| B.NE rc,rc,target | ⇔ | B.NOP      |

#### Selection

| class      | op       | compare        | type   |
|------------|----------|----------------|--------|
| arithmetic |          | L GE G LE      | NONE U |
| vs. zero   |          | L GE G LE E NE | Z      |
| bitwise    | none AND | E NE           |        |

#### **Format**

op rd,rc,target

if (op(rd,rc)) goto target;



#### Description

The contents of registers rd and rc are compared, as specified by the op field. If the result of the comparison is true, execution branches to the address specified by the offset field. Otherwise, execution continues at the next sequential instruction.

#### **Definition**

def BranchConditionally(op,rd,rc,offset) as

d ← RegRead(rd, 128)

c ← RegRead(rc, 128)

case op of

B.E:

#### Exceptions

none

# Branch Conditional Floating-Point

These operations compare two floating-point operands and, depending on the result of that comparison, conditionally branch to a nearby code location.

#### Operation codes

| Branch equal floating-point single         |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
|--------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Branch equal floating-point double         |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| Branch equal floating-point quad           |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| Branch greater equal floating-point half   |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| Branch greater equal floating-point single |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| Branch greater equal floating-point double |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| Branch greater equal floating-point quad   |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| Branch less floating-point half            |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| Branch less floating-point single          |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| Branch less floating-point double          |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| Branch less floating-point quad            |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| Branch less greater floating-point half    |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| Branch less greater floating-point single  |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| Branch less greater floating-point double  |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| Branch less greater floating-point quad    |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |

#### **Equivalencies**

| B.LE.F.16  | Branch less equal floating-point half   |
|------------|-----------------------------------------|
| B.LE.F.32  | Branch less equal floating-point single |
| B.LE.F.64  | Branch less equal floating-point double |
| B.LE.F.128 | Branch less equal floating-point quad   |
| B.G.F.16   | Branch greater floating-point half      |
| B.G.F.32   | Branch greater floating-point single    |
| B.G.F.64   | Branch greater floating-point double    |
| B.G.F.128  | Branch greater floating-point quad      |

| B.LE.F.size rc,rd,target | → | B.GE.F.size rd,rc,target |
|--------------------------|---|--------------------------|
| B.G.F.size rc,rd,target  | → | B.L.F.size rd,rc,target  |
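
The two equivalencies amount to an operand swap: "less equal" on (rc, rd) is the same test as "greater equal" on (rd, rc), and likewise for "greater" and "less". A minimal Python sketch of that rewrite follows. The helper name `expand_pseudo_branch` and the string-based operand representation are illustrative assumptions, not part of any Zeus tool.

```python
# Hypothetical assembler helper (name and representation are assumptions):
# rewrites the B.LE.F / B.G.F forms into B.GE.F / B.L.F by swapping operands.
PSEUDO_TO_REAL = {"B.LE.F": "B.GE.F", "B.G.F": "B.L.F"}

def expand_pseudo_branch(mnemonic, rc, rd, target):
    """Return (mnemonic, first_reg, second_reg, target) after expansion."""
    base, _, size = mnemonic.rpartition(".")  # e.g. ("B.LE.F", ".", "32")
    if base in PSEUDO_TO_REAL:
        # a <= b is the same test as b >= a; a > b is the same test as b < a
        return (f"{PSEUDO_TO_REAL[base]}.{size}", rd, rc, target)
    return (mnemonic, rc, rd, target)
```

For instance, `expand_pseudo_branch("B.G.F.64", 7, 9, "done")` yields the `B.L.F.64` form with the two register operands exchanged; mnemonics that are not in the table pass through unchanged.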

#### Selection

| number format  | type | compare        | size     |
|----------------|------|----------------|----------|
| floating-point | F    | E LG L GE G LE | 16 32 64 128 |


#### **Format**

op rd,rc,target

if (op(rd,rc)) goto target;



#### Description

The contents of registers rc and rd are compared, as specified by the op field. If the result of the comparison is true, execution branches to the address specified by the offset field. Otherwise, execution continues at the next sequential instruction.

```
def BranchConditionalFloatingPoint(op,rd,rc,offset) as
    case op of
         B.E.F.16, B.LG.F.16, B.L.F.16, B.GE.F.16:
              size ← 16
         B.E.F.32, B.LG.F.32, B.L.F.32, B.GE.F.32:
              size ← 32
         B.E.F.64, B.LG.F.64, B.L.F.64, B.GE.F.64:
              size ← 64
         B.E.F.128, B.LG.F.128, B.L.F.128, B.GE.F.128:
              size ← 128
    endcase
    d ← F(size,RegRead(rd, 128))
    c ← F(size,RegRead(rc, 128))
    v ← fcom(d, c)
    case op of
         B.E.F.16, B.E.F.32, B.E.F.64, B.E.F.128:
              a ← (v = E)
         B.LG.F.16, B.LG.F.32, B.LG.F.64, B.LG.F.128:
              a ← (v = L) or (v = G)
         B.L.F.16, B.L.F.32, B.L.F.64, B.L.F.128:
              a ← (v = L)
         B.GE.F.16, B.GE.F.32, B.GE.F.64, B.GE.F.128:
              a ← (v = G) or (v = E)
    endcase
    if a then
         ProgramCounter ← ProgramCounter + (offset₁₁⁵⁰ || offset || 0²)
         raise TakenBranch
    endif
enddef
```

#### Exceptions
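
The definition above can be modelled in a few lines of Python. This is only a sketch under stated assumptions: Python floats stand in for the register operands, the comparison result `U` (unordered) for NaN operands is an assumed fourth outcome of `fcom`, and the offset field is assumed to be 12 bits wide, sign-extended and scaled by 4. The function names are illustrative.

```python
import math

def fcom(d, c):
    """Classify d against c as 'L', 'G', 'E', or 'U' (unordered; assumed for NaN)."""
    if math.isnan(d) or math.isnan(c):
        return "U"
    if d < c:
        return "L"
    if d > c:
        return "G"
    return "E"

# Comparison outcomes that make each branch condition true.
CONDITIONS = {
    "E": {"E"},
    "LG": {"L", "G"},
    "L": {"L"},
    "GE": {"G", "E"},
}

def branch_taken(compare, d, c):
    """True if the branch with the given compare code would be taken."""
    return fcom(d, c) in CONDITIONS[compare]

def branch_target(pc, offset12):
    """Sign-extend the (assumed 12-bit) offset, scale by 4, add to pc, wrap to 64 bits."""
    offset12 &= 0xFFF
    if offset12 & 0x800:          # bit 11 is the sign bit
        offset12 -= 0x1000
    return (pc + (offset12 << 2)) & 0xFFFFFFFFFFFFFFFF
```

In this model a NaN operand makes every one of the four conditions false, so none of the branches is taken.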

# Branch Conditional Visibility Floating-Point

These operations compare two group-floating-point operands and, depending on the result of that comparison, conditionally branch to a nearby code location.

#### Operation codes

| B.I.F.32  | Branch invisible floating-point single     |
|-----------|--------------------------------------------|
| B.NI.F.32 | Branch not invisible floating-point single |
| B.NV.F.32 | Branch not visible floating-point single   |
| B.V.F.32  | Branch visible floating-point single       |

#### Selection

| number format  | type | compare   | size |
|----------------|------|-----------|------|
| floating-point | F    | I NI NV V | 32   |

#### **Format**

op rc,rd,target

if (op(rc,rd)) goto target;



#### Description

The contents of registers rc and rd are compared, as specified by the op field. If the result of the comparison is true, execution branches to the address specified by the offset field. Otherwise, execution continues at the next sequential instruction.

Each operand is assumed to represent a vertex of the form [w|z|y|x], packed into a single register. The comparisons check for visibility of a line connecting the vertices against a standard viewing volume, defined by the planes $x=w$, $x=-w$, $y=w$, $y=-w$, $z=0$, $z=1$. A line is visible (V) if the vertices are both within the volume. A line is not visible (NV) if either vertex is outside the volume; in such a case, the line may be partially visible. A line is invisible (I) if the vertices are both outside any face of the volume. A line is not invisible (NI) if the vertices are not both outside any face of the volume.
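
The four conditions can be illustrated with a small Python model. It is a sketch only: vertices are plain `(x, y, z, w)` tuples of Python floats rather than packed registers, the helper names are invented for illustration, and it assumes that a NaN in any coordinate makes all four conditions false.

```python
import math

def sign_bit(v):
    """True if the sign bit of v is set (including -0.0)."""
    return math.copysign(1.0, v) < 0.0

def inside(x, y, z, w):
    """Vertex lies within |x| < w, |y| < w, 0 <= z < 1."""
    return abs(x) < w and abs(y) < w and z < 1.0 and not sign_bit(z)

def beyond_same_plane(a, wa, b, wb):
    """Both coordinates lie beyond the same +w or -w plane."""
    return abs(a) > wa and abs(b) > wb and sign_bit(a) == sign_bit(b)

def classify(v0, v1):
    """Return the truth value of I, NI, NV and V for the line v0-v1."""
    if any(math.isnan(t) for t in (*v0, *v1)):
        # Assumption: NaN coordinates make every condition false.
        return {"I": False, "NI": False, "NV": False, "V": False}
    x0, y0, z0, w0 = v0
    x1, y1, z1, w1 = v1
    visible = inside(*v0) and inside(*v1)
    beyond_z = (z0 > 1.0 and z1 > 1.0) or (sign_bit(z0) and sign_bit(z1))
    invisible = (beyond_same_plane(x0, w0, x1, w1)
                 or beyond_same_plane(y0, w0, y1, w1)
                 or beyond_z)
    return {"I": invisible, "NI": not invisible, "NV": not visible, "V": visible}
```

A line with one vertex inside and one outside is the partially visible case: it is reported as NV and NI at the same time, so software would need to clip it rather than accept or reject it outright.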

#### **Definition**

```
def isnan(a) as (a.t=QNAN) or (a.t=SNAN) enddef

def less(a,b) as fcom(a,b)=L enddef

def trxy(a,b,c,d) as (fcom(fabs(a),b)=G) and (fcom(fabs(c),d)=G) and (a.s=c.s) enddef

def BranchConditionalVisibilityFloatingPoint(op,rd,rc,offset) as
     d ← RegRead(rd, 128)
     c ← RegRead(rc, 128)
     dx ← F(32,d₃₁..₀)
     cx ← F(32,c₃₁..₀)
     dy ← F(32,d₆₃..₃₂)
     cy ← F(32,c₆₃..₃₂)
     dz ← F(32,d₉₅..₆₄)
     cz ← F(32,c₉₅..₆₄)
     dw ← F(32,d₁₂₇..₉₆)
     cw ← F(32,c₁₂₇..₉₆)
     f1 ← F(32,0x3f800000) // floating-point 1.0
     if isnan(dx) or isnan(dy) or isnan(dz) or isnan(dw) or isnan(cx) or isnan(cy) or isnan(cz) or isnan(cw) then
          a ← false
     else
          dv ← less(fabs(dx),dw) and less(fabs(dy),dw) and less(dz,f1) and (dz.s=0)
          cv ← less(fabs(cx),cw) and less(fabs(cy),cw) and less(cz,f1) and (cz.s=0)
          trz ← (less(f1,dz) and less(f1,cz)) or ((dz.s=1) and (cz.s=1))
          tr ← trxy(dx,dw,cx,cw) or trxy(dy,dw,cy,cw) or trz
          case op of
               B.I.F.32:
                    a ← tr
               B.NI.F.32:
                    a ← not tr
               B.NV.F.32:
                    a ← not (dv and cv)
               B.V.F.32:
                    a ← dv and cv
          endcase
     endif
     if a then
          ProgramCounter ← ProgramCounter + (offset₁₁⁵⁰ || offset || 0²)
          raise TakenBranch
     endif
enddef
```

#### Exceptions

none
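
The field extraction at the start of the definition can be tried out in Python with the standard `struct` module. The sketch below assumes the register is held as a 128-bit Python integer and that each field is an IEEE 754 single-precision value, with x in bits 31..0, y in bits 63..32, z in bits 95..64 and w in bits 127..96. The function names are illustrative.

```python
import struct

def unpack_vertex(reg128):
    """Split a 128-bit integer into (x, y, z, w) single-precision values."""
    raw = reg128.to_bytes(16, "little")      # byte 0 holds bits 7..0
    return struct.unpack("<4f", raw)         # x, y, z, w in ascending bit order

def pack_vertex(x, y, z, w):
    """Inverse of unpack_vertex: build the 128-bit register image."""
    return int.from_bytes(struct.pack("<4f", x, y, z, w), "little")
```

Packing `1.0` into the x field places the single-precision bit pattern for 1.0 in the low 32 bits of the result, which is a convenient way to check constants used in comparisons against the z=1 plane.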

# Branch Down

This operation branches to a location specified by a register, reducing the current privilege level.

#### Operation codes

| B.DOWN | Branch down |
|--------|-------------|

#### **Format**

B.DOWN rd

bdown(rd)



#### Description

Execution branches to the address specified by the contents of register rd. The current privilege level is reduced to the level specified by the low order two bits of the contents of register rd.
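
As a rough illustration of this behaviour, the following Python sketch computes the new program counter and privilege level from the 64-bit contents of rd. The function name and the tuple return value are assumptions made for the example; the key point is that the privilege level can only stay the same or decrease.

```python
MASK64 = (1 << 64) - 1

def branch_down(privilege_level, rd_value):
    """Return (new_pc, new_privilege_level) for a B.DOWN through rd_value."""
    requested = rd_value & 0b11                    # low-order two bits
    new_level = min(privilege_level, requested)    # never raises privilege
    new_pc = rd_value & ~0b11 & MASK64             # high 62 bits, low bits cleared
    return new_pc, new_level
```

A request for a level above the current one leaves the level unchanged, so `branch_down(1, 0x4003)` still runs at level 1.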

#### **Definition**

```
def BranchDown(rd,rc,rb) as
     if (rc ≠ 0) or (rb ≠ 0) then
          raise ReservedInstruction
     endif
     d ← RegRead(rd, 64)
     if PrivilegeLevel > d₁..₀ then
          PrivilegeLevel ← d₁..₀
     endif
     ProgramCounter ← d₆₃..₂ || 0²
     raise TakenBranch
enddef
```

#### Exceptions

Reserved Instruction


# Branch Gateway

This operation provides a secure means to call a procedure, including those at a higher privilege level.

#### Operation codes

| B.GATE | Branch gateway |
|--------|----------------|

#### **Equivalencies**

B.GATE 

B.GATE 0

#### **Format**

B.GATE rb

bgate(rb)

| 31..24  | 23..18 | 17..12 | 11..6 | 5..0   |
|---------|--------|--------|-------|--------|
| B.MINOR | 0      | 1      | rb    | B.GATE |
| 8       | 6      | 6      | 6     | 6      |
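
The field layout can be exercised with a short Python sketch that splits a 32-bit instruction word into its five fields. The dictionary keys are illustrative names, and the numeric values of the B.MINOR and B.GATE opcodes are not given here, so the sketch does not check them; it only tests the two fixed register fields, which must be 0 and 1 respectively.

```python
def decode_fields(word):
    """Split a 32-bit instruction word into the five fields of the format."""
    return {
        "major": (word >> 24) & 0xFF,   # bits 31..24 (B.MINOR)
        "rd":    (word >> 18) & 0x3F,   # bits 23..18, must be 0
        "rc":    (word >> 12) & 0x3F,   # bits 17..12, must be 1
        "rb":    (word >> 6) & 0x3F,    # bits 11..6
        "minor": word & 0x3F,           # bits 5..0 (B.GATE)
    }

def has_reserved_fields(word):
    """True if the fixed rd/rc fields do not hold the required 0 and 1."""
    f = decode_fields(word)
    return f["rd"] != 0 or f["rc"] != 1
```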

#### Description

The contents of register rb are a branch address in the high-order 62 bits and a new privilege level in the low-order 2 bits. A branch and link occurs to the branch address, and the privilege level is raised to the new privilege level. The high-order 62 bits of the successor to the current program counter are catenated with the 2-bit current execution privilege and placed in register 0.
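
The two bit-field manipulations described here, splitting the gateway operand and forming the link value, can be sketched in Python as follows. The function names are illustrative, and the sketch assumes 4-byte instructions so that the successor of the program counter is the current value plus 4.

```python
MASK64 = (1 << 64) - 1

def split_gateway_operand(rb_value):
    """Return (branch_address, new_privilege_level) from the contents of rb."""
    return rb_value & ~0b11 & MASK64, rb_value & 0b11

def link_value(program_counter, privilege_level):
    """High 62 bits of the successor PC catenated with the 2-bit privilege level."""
    successor = (program_counter + 4) & MASK64   # assumes 4-byte instructions
    return (successor & ~0b11) | (privilege_level & 0b11)
```

Because the link value has the same layout as the gateway operand, it can later be split again to recover both the return address and the privilege level to return to.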

If the new privilege level is greater than the current privilege level, an octlet of memory data is fetched from the address specified by register 1, using the little-endian byte order and a gateway access type. A GatewayDisallowed exception occurs if the original contents of register 0 do not equal the memory data.

If the new privilege level is the same as the current privilege level, no checking of register 1 is performed.

An AccessDisallowed exception occurs if the new privilege level is greater than the privilege level required to write the memory data, or if the old privilege level is lower than the privilege required to access the memory data as a gateway, or if the access is not aligned on an 8-byte boundary.
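
The checks described in the preceding paragraphs can be approximated with the Python sketch below. It is a simplified model under several assumptions: memory is a hypothetical dictionary from octlet-aligned addresses to 64-bit values, the alignment check is applied before the privilege comparison, and the privilege checks made by the virtual memory system on the gateway access are left out entirely. The exception classes are defined locally for the example.

```python
class GatewayDisallowed(Exception):
    pass

class AccessDisallowed(Exception):
    pass

def check_gateway(current_level, rb_value, r1_value, memory):
    """Return the privilege level in effect after the gateway checks.

    memory is a hypothetical dict mapping aligned addresses to octlets.
    """
    new_level = rb_value & 0b11
    if r1_value % 8 != 0:
        raise AccessDisallowed("gateway pointer is not octlet aligned")
    if new_level > current_level:
        # Only a privilege increase requires the gateway octlet to match.
        if memory[r1_value] != rb_value:
            raise GatewayDisallowed("operand does not match gateway octlet")
        return new_level
    return current_level
```

The comparison against memory is what makes the gateway secure: less privileged code can only enter more privileged code at addresses, and with privilege levels, that were stored in gateway octlets in advance.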

A ReservedInstruction exception occurs if the rc field is not one or the rd field is not zero.

In the example below, a gateway from level 0 to level 2 is illustrated. The gateway pointer, located by the contents of register rc (1), is fetched from memory and compared against the contents of register rb (0). The instruction may only complete if these values are equal. Concurrently, the contents of register rb (0) are placed in the program counter and privilege level, and the address of the next sequential instruction and the current privilege level are placed into register rd (0). Code at the target of the gateway locates the data pointer at an offset from the gateway pointer (register 1), and fetches it into register 1, making a data region available. A stack pointer may be saved and fetched using the data region, another region located from the data region, or a data region located as an offset from the original gateway pointer.



Branch gateway

For additional information on the branch-gateway instruction, see the System and Privileged Library Calls section on page 44.

This instruction gives the target procedure the assurances that register 0 contains a valid return address and privilege level, that register 1 points to the gateway location, and that the gateway location is octlet aligned. Register 1 can then be used to securely reach values in memory. If no sharing of literal pools is desired, register 1 may be used as a literal pool pointer directly. If sharing of literal pools is desired, register 1 may be used with an appropriate offset to load a new literal pool pointer; for example, with a one cache line offset from register 1. Note that because the virtual memory system operates with cache line granularity, several gateway locations must be created together.

Software must ensure that an attempt to use any octlet within the region designated by virtual memory as gateway either functions properly or causes a legitimate exception. For example, if the adjacent octlets contain pointers to literal pool locations, software should ensure that these literal pools are not executable, or that, by virtue of being aligned addresses, they cannot raise the execution privilege level. If register 1 is used directly as a literal pool location, software must ensure that the literal pool locations that are accessible as a gateway do not lead to a security violation.

Because register 0 contains a valid return address and privilege level, the value is suitable for use directly in the Branch-down (B.DOWN) instruction to return to the gateway callee.
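The gateway checks described above can be sketched as a small Python model. This is an illustrative sketch only, not the processor's implementation: the `Machine` class, the `branch_gateway` function, the dictionary used as memory, the 64-entry register file, and the Python exception classes are all hypothetical names chosen for this example, and the virtual-memory access-control checks (the gateway access type) are omitted.

```python
MASK64 = (1 << 64) - 1

class ReservedInstruction(Exception):
    """Raised when the rd or rc field has an unexpected value."""

class AccessDisallowed(Exception):
    """Raised when the gateway pointer is not octlet aligned."""

class GatewayDisallowed(Exception):
    """Raised when register contents and the gateway octlet differ."""

class Machine:
    """Hypothetical machine state used only for this sketch."""
    def __init__(self, pc, privilege, memory):
        self.pc = pc
        self.privilege = privilege        # 2-bit privilege level
        self.regs = [0] * 64              # assumption: 64 general registers
        self.memory = memory              # assumption: dict of aligned address -> octlet

def branch_gateway(m, rd=0, rc=1, rb=0):
    if rd != 0 or rc != 1:
        raise ReservedInstruction()
    c = m.regs[rc]                        # gateway pointer
    b = m.regs[rb]                        # target address, privilege in bits 1..0
    if c & 0b111:                         # gateway location must be octlet aligned
        raise AccessDisallowed()
    # Return link: address of the next instruction plus the current privilege.
    link = ((((m.pc >> 2) + 1) << 2) & MASK64) | m.privilege
    new_privilege = b & 0b11
    if m.privilege < new_privilege:
        # Raising privilege: the octlet at the gateway must match the target.
        if m.memory.get(c) != b:
            raise GatewayDisallowed()
        m.privilege = new_privilege
    m.pc = b & ~0b11 & MASK64
    m.regs[rd] = link
```

In this model, a call from level 0 through a gateway octlet holding `0x2002` lands at address `0x2000` with privilege level 2, and register 0 receives the return link.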

#### Definition

```
def BranchGateway(rd,rc,rb) as
     c ← RegRead(rc, 64)
     b ← RegRead(rb, 64)
     if (rd ≠ 0) or (rc ≠ 1) then
          raise ReservedInstruction
     endif
     if c<sub>2..0</sub> ≠ 0 then
          raise AccessDisallowedByVirtualAddress
     endif
     d ← ProgramCounter<sub>63..2</sub>+1 || PrivilegeLevel
     if PrivilegeLevel < b<sub>1..0</sub> then
          m ← LoadMemoryG(c,c,64,L)
          if b ≠ m then
               raise GatewayDisallowed
          endif
          PrivilegeLevel ← b<sub>1..0</sub>
     endif
     ProgramCounter ← b<sub>63..2</sub> || 0<sup>2</sup>
     RegWrite(rd, 64, d)
     raise TakenBranch
enddef
```

#### Exceptions

Reserved Instruction
Gateway disallowed
Access disallowed by virtual address
Access disallowed by tag
Access disallowed by global TB
Access disallowed by local TB
Access detail required by tag
Access detail required by local TB
Access detail required by global TB
Local TB miss
Global TB miss

# Branch Halt

This operation stops the current thread until an exception occurs.

#### Operation codes

| B.HALT | Branch halt |
|--------|-------------|
|        |             |

#### **Format**

B.HALT

bhalt()



#### Description

This instruction directs the instruction fetch unit to cease execution until an exception occurs.

#### **Definition**

```
def BranchHalt(rd,rc,rb) as
     if (rd ≠ 0) or (rc ≠ 0) or (rb ≠ 0) then
          raise ReservedInstruction
     endif
     FetchHalt()
enddef
```

#### Exceptions

Reserved Instruction

# **Branch Hint**

This operation indicates a future branch location specified by a register.

### Operation codes

| B.HINT | Branch Hint |
|--------|-------------|

#### **Format**

B.HINT badd,count,rd

bhint(badd,count,rd)

| 31..24  | 23..18 | 17..12 | 11..6 | 5..0   |
|---------|--------|--------|-------|--------|
| B.MINOR | rd     | count  | simm  | B.HINT |
| 8       | 6      | 6      | 6     | 6      |

simm ← badd-pc-4

#### Description

This instruction directs the instruction fetch unit of the processor that a branch is likely to occur count times at simm instructions following the current successor instruction to the address specified by the contents of register rd.

After branching count times, the instruction fetch unit should presume that the branch at simm instructions following the current successor instruction is not likely to occur. If count is zero, this hint directs the instruction fetch unit that the branch is likely to occur more than 63 times.

An access disallowed exception occurs if the contents of register rd are not aligned on a quadlet boundary.
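The hint parameters can be illustrated with a short Python sketch. It assumes, following the format and definition in this section, that simm is a 6-bit unsigned instruction count relative to the successor instruction. The function names and the exception class are hypothetical, and `None` is used here merely as a stand-in for "more than 63 times".

```python
MASK64 = (1 << 64) - 1

class AccessDisallowedByVirtualAddress(Exception):
    """Raised when the hinted target is not quadlet aligned."""

def hinted_branch_address(pc, simm):
    """Address of the branch instruction that the hint refers to."""
    if not 0 <= simm <= 63:
        raise ValueError("simm is a 6-bit unsigned field")
    # simm counts 4-byte instructions after the successor instruction (pc + 4).
    return (pc + 4 + (simm << 2)) & MASK64

def check_hint_target(d):
    """Validate the target address taken from register rd."""
    if d & 0b11:
        raise AccessDisallowedByVirtualAddress(hex(d))
    return d & MASK64

def expected_branch_count(count):
    """Number of times the branch is expected to be taken, or None if unbounded."""
    if not 0 <= count <= 63:
        raise ValueError("count is a 6-bit unsigned field")
    return None if count == 0 else count
```

For example, a hint at address `0x1000` with simm equal to 3 refers to the branch instruction at `0x1010`.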

#### **Definition**

def BranchHint(rd,count,simm) as

d ← RegRead(rd, 64)

if (d<sub>1..0</sub>) ≠ 0 then

raise AccessDisallowedByVirtualAddress

endif

FetchHint(ProgramCounter + 4 + (0 || simm || 0<sup>2</sup>), d<sub>63..2</sub> || 0<sup>2</sup>, count)

enddef

### Exceptions

Access disallowed by virtual address


# Branch Hint Immediate

This operation indicates a future branch location specified as an offset from the program counter.

#### Operation codes

| B.HINT.I | Branch Hint Immediate |
|----------|-----------------------|

#### **Format**

B.HINT.I badd,count,target

bhinti(badd,count,target)



simm ← badd-pc-4

#### Description

This instruction directs the instruction fetch unit of the processor that a branch is likely to occur count times at simm instructions following the current successor instruction to the address specified by the offset field.

After branching count times, the instruction fetch unit should presume that the branch at simm instructions following the current successor instruction is not likely to occur. If count is zero, this hint directs the instruction fetch unit that the branch is likely to occur more than 63 times.

### Definition

def BranchHintImmediate(simm,count,offset) as

BranchHint(ProgramCounter + 4 + (0 || simm || 0<sup>2</sup>), count, ProgramCounter + (offset<sub>11</sub><sup>44</sup> || offset || 0<sup>2</sup>))

enddef

#### Exceptions

none

# Branch Immediate

This operation branches to a location that is specified as an offset from the program counter.

#### Operation codes

| B.I | Branch immediate |
|-----|------------------|
|     |                  |

#### Redundancies

| B.I target | ⇔ | B.E rc,rc,target |
|------------|---|------------------|

#### **Format**

B.I target

bi(target)



### Description

Execution branches to the address specified by the offset field.

### Definition

#### Exceptions

none


# Branch Immediate Link

This operation branches to a location that is specified as an offset from the program counter, saving the value of the program counter into register 0.

### Operation codes

| B.LINK.I | Branch immediate link |
|----------|-----------------------|

#### **Format**

B.LINK.I target

blinki(target)



#### Description

The address of the instruction following this one is placed into register 0. Execution branches to the address specified by the offset field.

#### Definition

def BranchImmediateLink(offset) as

RegWrite(0, 64, ProgramCounter + 4)

ProgramCounter ← ProgramCounter + (offset<sub>23</sub><sup>38</sup> || offset || 0<sup>2</sup>)

raise TakenBranch

enddef

#### Exceptions

none

# Branch Link

This operation branches to a location specified by a register, saving the value of the program counter into a register.

#### Operation codes

| B.LINK | Branch link |
|--------|-------------|

#### Equivalencies

| B.LINK    | ← | B.LINK 0=0  |
|-----------|---|-------------|
| B.LINK rc | ← | B.LINK 0=rc |

#### **Format**

B.LINK rd=rc



 $rb \leftarrow 0$ 

#### Description

The address of the instruction following this one is placed into register rd. Execution branches to the address specified by the contents of register rc.

An access disallowed exception occurs if the contents of register rc are not aligned on a quadlet boundary.

Reserved instruction exception occurs if rb is not zero.
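A minimal Python sketch of the behaviour described above follows. The function name, the list used as a register file, and the exception classes are assumptions made for illustration. Reading the target before writing the link register keeps the model correct when rd and rc name the same register.

```python
MASK64 = (1 << 64) - 1

class ReservedInstruction(Exception):
    """Raised when the rb field is not zero."""

class AccessDisallowedByVirtualAddress(Exception):
    """Raised when the branch target is not quadlet aligned."""

def branch_link(pc, regs, rd, rc, rb=0):
    """Write the return address to regs[rd] and return the new program counter."""
    if rb != 0:
        raise ReservedInstruction()
    target = regs[rc]                 # read the target first, in case rd == rc
    if target & 0b11:
        raise AccessDisallowedByVirtualAddress(hex(target))
    regs[rd] = (pc + 4) & MASK64      # address of the following instruction
    return target & MASK64
```

Called with a program counter of `0x100` and a target register holding `0x4000`, the sketch returns `0x4000` and leaves `0x104` in the link register.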

#### Definition

```
def BranchLink(rd,rc,rb) as
     if rb ≠ 0 then
          raise ReservedInstruction
     endif
     c ← RegRead(rc, 64)
     if (c and 3) ≠ 0 then
          raise AccessDisallowedByVirtualAddress
     endif
     RegWrite(rd, 64, ProgramCounter + 4)
     ProgramCounter ← c<sub>63..2</sub> || 0<sup>2</sup>
     raise TakenBranch
enddef
```


#### Exceptions

Reserved Instruction
Access disallowed by virtual address

# Load

These operations compute a virtual address from the contents of two registers, load data from memory, sign- or zero-extending the data to fill the destination register.

#### Operation codes

| L.8<sup>10</sup>      | Load signed byte                            |
|-----------------------|---------------------------------------------|
| L.16.B                | Load signed doublet big-endian              |
| L.16.AB               | Load signed doublet aligned big-endian      |
| L.16.L                | Load signed doublet little-endian           |
| L.16.AL               | Load signed doublet aligned little-endian   |
| L.32.B                | Load signed quadlet big-endian              |
| L.32.AB               | Load signed quadlet aligned big-endian      |
| L.32.L                | Load signed quadlet little-endian           |
| L.32.AL               | Load signed quadlet aligned little-endian   |
| L.64.B                | Load signed octlet big-endian               |
| L.64.AB               | Load signed octlet aligned big-endian       |
| L.64.L                | Load signed octlet little-endian            |
| L.64.AL               | Load signed octlet aligned little-endian    |
| L.128.B<sup>11</sup>  | Load hexlet big-endian                      |
| L.128.AB<sup>12</sup> | Load hexlet aligned big-endian              |
| L.128.L<sup>13</sup>  | Load hexlet little-endian                   |
| L.128.AL<sup>14</sup> | Load hexlet aligned little-endian           |
| L.U.8<sup>15</sup>    | Load unsigned byte                          |
| L.U.16.B              | Load unsigned doublet big-endian            |
| L.U.16.AB             | Load unsigned doublet aligned big-endian    |
| L.U.16.L              | Load unsigned doublet little-endian         |
| L.U.16.AL             | Load unsigned doublet aligned little-endian |
| L.U.32.B              | Load unsigned quadlet big-endian            |
| L.U.32.AB             | Load unsigned quadlet aligned big-endian    |
| L.U.32.L              | Load unsigned quadlet little-endian         |
| L.U.32.AL             | Load unsigned quadlet aligned little-endian |
| L.U.64.B              | Load unsigned octlet big-endian             |
| L.U.64.AB             | Load unsigned octlet aligned big-endian     |
| L.U.64.L              | Load unsigned octlet little-endian          |
| L.U.64.AL             | Load unsigned octlet aligned little-endian  |

<sup>10</sup> L.8 need not distinguish between little-endian and big-endian ordering, nor between aligned and unaligned, as only a single byte is loaded.

<sup>11</sup> L.128.B need not distinguish between signed and unsigned, as the hexlet fills the destination register.

<sup>12</sup> L.128.AB need not distinguish between signed and unsigned, as the hexlet fills the destination register.

<sup>13</sup> L.128.L need not distinguish between signed and unsigned, as the hexlet fills the destination register.

<sup>14</sup> L.128.AL need not distinguish between signed and unsigned, as the hexlet fills the destination register.

<sup>15</sup> L.U.8 need not distinguish between little-endian and big-endian ordering, nor between aligned and unaligned, as only a single byte is loaded.


#### Selection:

| number format            | type | size     | alignment | ordering |
|--------------------------|------|----------|-----------|----------|
| signed byte              |      | 8        |           |          |
| unsigned byte            | U    | 8        |           |          |
| signed integer           |      | 16 32 64 |           | L B      |
| signed integer aligned   |      | 16 32 64 | A         | L B      |
| unsigned integer         | U    | 16 32 64 |           | L B      |
| unsigned integer aligned | U    | 16 32 64 | A         | L B      |
| register                 |      | 128      |           | L B      |
| register aligned         |      | 128      | A         | L B      |

#### **Format**

op rd=rc,rb

rd=op(rc,rb)



#### Description

An operand size, expressed in bytes, is specified by the instruction. A virtual address is computed from the sum of the contents of register rc and the contents of register rb multiplied by operand size. The contents of memory using the specified byte order are read, treated as the size specified, zero-extended or sign-extended as specified, and placed into register rd.

If alignment is specified, the computed virtual address must be aligned, that is, it must be an exact multiple of the size expressed in bytes. If the address is not aligned, an "access disallowed by virtual address" exception occurs.
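The address computation, alignment rule, byte ordering, and extension described above can be sketched in Python. This is a simplified model under stated assumptions: memory is a plain `bytes` object indexed by virtual address, translation and access-control exceptions are ignored, and the function name, parameter names, and exception class are hypothetical. The alignment check is applied to the computed virtual address, as the description states.

```python
MASK64 = (1 << 64) - 1
MASK128 = (1 << 128) - 1

class AccessDisallowedByVirtualAddress(Exception):
    """Raised when an aligned load is attempted at an unaligned address."""

def load(memory, base, index, size_bits, signed, order="L", aligned=False):
    """Return the 128-bit register value produced by a load."""
    size_bytes = size_bits // 8
    # Virtual address: base register plus index register scaled by operand size.
    addr = (base + index * size_bytes) & MASK64
    if aligned and addr % size_bytes != 0:
        raise AccessDisallowedByVirtualAddress(hex(addr))
    raw = memory[addr:addr + size_bytes]
    if len(raw) != size_bytes:
        raise IndexError("load outside of the modelled memory")
    byteorder = "little" if order == "L" else "big"
    value = int.from_bytes(raw, byteorder, signed=signed)
    # Sign- or zero-extension to fill the 128-bit destination register.
    return value & MASK128
```

With a modelled memory of `bytes([0xFE, 0xFF, 0x34, 0x12])`, a signed little-endian doublet load at index 0 fills the upper bits of the register with ones, while an unsigned load at index 1 zero-extends the value `0x1234`.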

#### **Definition**

```
def Load(op,rd,rc,rb) as
    case op of
          L16L, L32L, L8, L16AL, L32AL, L16B, L32B, L16AB, L32AB,
          L64L, L64AL, L64B, L64AB:
               signed ← true
          LU16L, LU32L, LU8, LU16AL, LU32AL,
          LU16B, LU32B, LU16AB, LU32AB,
          LU64L, LU64AL, LU64B, LU64AB:
               signed ← false
          L128L, L128AL, L128B, L128AB:
               signed ← undefined
    endcase
    case op of
          L8, LU8:
               size ← 8
          L16L, LU16L, L16AL, LU16AL, L16B, LU16B, L16AB, LU16AB:
               size ← 16
          L32L, LU32L, L32AL, LU32AL, L32B, LU32B, L32AB, LU32AB:
               size ← 32
          L64L, LU64L, L64AL, LU64AL, L64B, LU64B, L64AB, LU64AB:
               size ← 64
          L128L, L128AL, L128B, L128AB:
               size ← 128
    endcase
    lsize ← log(size)
    case op of
          L16L, LU16L, L32L, LU32L, L64L, LU64L, L128L,
          L16AL, LU16AL, L32AL, LU32AL, L64AL, LU64AL, L128AL:
               order ← L
          L16B, LU16B, L32B, LU32B, L64B, LU64B, L128B,
          L16AB, LU16AB, L32AB, LU32AB, L64AB, LU64AB, L128AB:
               order ← B
          L8, LU8:
               order ← undefined
    endcase
    c ← RegRead(rc, 64)
    b ← RegRead(rb, 64)
    VirtAddr ← c + (b<sub>66-lsize..0</sub> || 0<sup>lsize-3</sup>)
    case op of
          L16AL, LU16AL, L32AL, LU32AL, L64AL, LU64AL, L128AL,
          L16AB, LU16AB, L32AB, LU32AB, L64AB, LU64AB, L128AB:
               if (c<sub>lsize-4..0</sub> ≠ 0) then
                    raise AccessDisallowedByVirtualAddress
               endif
          L16L, LU16L, L32L, LU32L, L64L, LU64L, L128L,
          L16B, LU16B, L32B, LU32B, L64B, LU64B, L128B:
          L8, LU8:
    endcase
    m ← LoadMemory(c,VirtAddr,size,order)
    a ← (m<sub>size-1</sub> and signed)<sup>128-size</sup> || m
    RegWrite(rd, 128, a)
enddef
```

### Exceptions

Access disallowed by virtual address
Access disallowed by tag
Access disallowed by global TB
Access disallowed by local TB
Access detail required by tag
Access detail required by local TB
Access detail required by global TB
Local TB miss
Global TB miss


# Load Immediate

These operations compute a virtual address from the contents of a register and a sign-extended immediate value, load data from memory, sign- or zero-extending the data to fill the destination register.

#### Operation codes

| L.I.8<sup>16</sup>      | Load immediate signed byte                            |
|-------------------------|-------------------------------------------------------|
| L.I.16.AB               | Load immediate signed doublet aligned big-endian      |
| L.I.16.B                | Load immediate signed doublet big-endian              |
| L.I.16.AL               | Load immediate signed doublet aligned little-endian   |
| L.I.16.L                | Load immediate signed doublet little-endian           |
| L.I.32.AB               | Load immediate signed quadlet aligned big-endian      |
| L.I.32.B                | Load immediate signed quadlet big-endian              |
| L.I.32.AL               | Load immediate signed quadlet aligned little-endian   |
| L.I.32.L                | Load immediate signed quadlet little-endian           |
| L.I.64.AB               | Load immediate signed octlet aligned big-endian       |
| L.I.64.B                | Load immediate signed octlet big-endian               |
| L.I.64.AL               | Load immediate signed octlet aligned little-endian    |
| L.I.64.L                | Load immediate signed octlet little-endian            |
| L.I.128.AB<sup>17</sup> | Load immediate hexlet aligned big-endian              |
| L.I.128.B<sup>18</sup>  | Load immediate hexlet big-endian                      |
| L.I.128.AL<sup>19</sup> | Load immediate hexlet aligned little-endian           |
| L.I.128.L<sup>20</sup>  | Load immediate hexlet little-endian                   |
| L.I.U.8<sup>21</sup>    | Load immediate unsigned byte                          |
| L.I.U.16.AB             | Load immediate unsigned doublet aligned big-endian    |
| L.I.U.16.B              | Load immediate unsigned doublet big-endian            |
| L.I.U.16.AL             | Load immediate unsigned doublet aligned little-endian |
| L.I.U.16.L              | Load immediate unsigned doublet little-endian         |
| L.I.U.32.AB             | Load immediate unsigned quadlet aligned big-endian    |
| L.I.U.32.B              | Load immediate unsigned quadlet big-endian            |
| L.I.U.32.AL             | Load immediate unsigned quadlet aligned little-endian |
| L.I.U.32.L              | Load immediate unsigned quadlet little-endian         |
| L.I.U.64.AB             | Load immediate unsigned octlet aligned big-endian     |
| L.I.U.64.B              | Load immediate unsigned octlet big-endian             |
| L.I.U.64.AL             | Load immediate unsigned octlet aligned little-endian  |
| L.I.U.64.L              | Load immediate unsigned octlet little-endian          |

<sup>16</sup> L.I.8 need not distinguish between little-endian and big-endian ordering, nor between aligned and unaligned, as only a single byte is loaded.

<sup>17</sup> L.I.128.AB need not distinguish between signed and unsigned, as the hexlet fills the destination register.

<sup>18</sup> L.I.128.B need not distinguish between signed and unsigned, as the hexlet fills the destination register.

<sup>19</sup> L.I.128.AL need not distinguish between signed and unsigned, as the hexlet fills the destination register.

<sup>20</sup> L.I.128.L need not distinguish between signed and unsigned, as the hexlet fills the destination register.

<sup>21</sup> L.I.U.8 need not distinguish between little-endian and big-endian ordering, nor between aligned and unaligned, as only a single byte is loaded.

#### Selection

| number format            | type | size     | alignment | ordering |
|--------------------------|------|----------|-----------|----------|
| signed byte              |      | 8        |           |          |
| unsigned byte            | U    | 8        |           |          |
| signed integer           |      | 16 32 64 |           | L B      |
| signed integer aligned   |      | 16 32 64 | A         | L B      |
| unsigned integer         | U    | 16 32 64 |           | L B      |
| unsigned integer aligned | U    | 16 32 64 | A         | L B      |
| register                 |      | 128      |           | L B      |
| register aligned         |      | 128      | A         | L B      |

#### **Format**

op rd=rc,offset

rd=op(rc,offset)



#### Description

An operand size, expressed in bytes, is specified by the instruction. A virtual address is computed from the sum of the contents of register rc and the sign-extended value of the offset field, multiplied by the operand size. The contents of memory using the specified byte order are read, treated as the size specified, zero-extended or sign-extended as specified, and placed into register rd.

If alignment is specified, the computed virtual address must be aligned, that is, it must be an exact multiple of the size expressed in bytes. If the address is not aligned, an "access disallowed by virtual address" exception occurs.
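The immediate form differs from the register form only in how the address is produced. The following Python sketch shows that computation in isolation. The function names are hypothetical, and the default 12-bit width of the offset field is an assumption made for this example rather than something stated in the description above.

```python
MASK64 = (1 << 64) - 1

def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)

def load_immediate_address(base, offset_field, size_bits, offset_bits=12):
    """Virtual address: base plus the sign-extended offset scaled by operand size."""
    size_bytes = size_bits // 8
    scaled = sign_extend(offset_field, offset_bits) * size_bytes
    return (base + scaled) & MASK64

def is_aligned(addr, size_bits):
    """True if addr is an exact multiple of the operand size in bytes."""
    return addr % (size_bits // 8) == 0
```

Because the offset is scaled by the operand size, an offset field of all ones addresses the operand immediately below the base: for an octlet load with base `0x1000`, that is address `0xFF8`.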

#### **Definition**

```
def LoadImmediate(op,rd,rc,offset) as
    case op of
         LI16L, LI32L, LI8, LI16AL, LI32AL, LI16B, LI32B, LI16AB, LI32AB,
         LI64L, LI64AL, LI64B, LI64AB:
              signed ← true
         LIU16L, LIU32L, LIU8, LIU16AL, LIU32AL,
         LIU16B, LIU32B, LIU16AB, LIU32AB,
         LIU64L, LIU64AL, LIU64B, LIU64AB:
              signed ← false
         LI128L, LI128AL, LI128B, LI128AB:
              signed ← undefined
    endcase
    case op of
         LI8, LIU8:
              size ← 8
         LI16L, LIU16L, LI16AL, LIU16AL, LI16B, LIU16B, LI16AB, LIU16AB:
              size ← 16
         LI32L, LIU32L, LI32AL, LIU32AL, LI32B, LIU32B, LI32AB, LIU32AB:
              size ← 32
         LI64L, LIU64L, LI64AL, LIU64AL, LI64B, LIU64B, LI64AB, LIU64AB:
              size ← 64
         LI128L, LI128AL, LI128B, LI128AB:
              size ← 128
    endcase
    lsize ← log(size)
    case op of
         LI16L, LIU16L, LI32L, LIU32L, LI64L, LIU64L, LI128L,
         LI16AL, LIU16AL, LI32AL, LIU32AL, LI64AL, LIU64AL, LI128AL:
              order ← L
         LI16B, LIU16B, LI32B, LIU32B, LI64B, LIU64B, LI128B,
         LI16AB, LIU16AB, LI32AB, LIU32AB, LI64AB, LIU64AB, LI128AB:
              order ← B
         LI8, LIU8:
              order ← undefined
    endcase
    c ← RegRead(rc, 64)
    VirtAddr ← c + (offset<sub>11</sub><sup>55-lsize</sup> || offset || 0<sup>lsize-3</sup>)
    case op of
         LI16AL, LIU16AL, LI32AL, LIU32AL, LI64AL, LIU64AL, LI128AL,
         LI16AB, LIU16AB, LI32AB, LIU32AB, LI64AB, LIU64AB, LI128AB:
              if (c<sub>lsize-4..0</sub> ≠ 0) then
                   raise AccessDisallowedByVirtualAddress
              endif
         LI16L, LIU16L, LI32L, LIU32L, LI64L, LIU64L, LI128L,
         LI16B, LIU16B, LI32B, LIU32B, LI64B, LIU64B, LI128B:
         LI8, LIU8:
    endcase
    m ← LoadMemory(c,VirtAddr,size,order)
    a ← (m<sub>size-1</sub> and signed)<sup>128-size</sup> || m
    RegWrite(rd, 128, a)
enddef
```

#### Exceptions

Access disallowed by virtual address
Access disallowed by tag
Access disallowed by global TB
Access disallowed by local TB
Access detail required by tag
Access detail required by local TB
Access detail required by global TB
Local TB miss
Global TB miss